I am just learning Python, and trying to load some records into an excel. Here is the code I am running.
# Install the openpyxl module
#%pip install openpyxl
# Import pandas
import pandas as pd
import os
df = spark.sql("""
select distinct BILLING_ACCOUNT_NUMBER from prd_pdt_general_scm.scm_core.subscriber_soc
where SOC_CODE='P360HT'
and SUBSCRIBER_NUMBER ='0000000000'
and curr_ind='Y'
""")
#display(df) # this is showing the results correctly.
# Convert the Spark DataFrame to a Pandas DataFrame
pandas_df = df.toPandas()
# Define the directory and file path
directory = r'C:/Users/achatte17/Documents/ISE'
excel_file_path = os.path.join(directory, 'output.xlsx')
display(excel_file_path) #Also showing the path correctly
# Create the directory if it does not exist
os.makedirs(directory, exist_ok=True)
# Save the DataFrame to an Excel file
writer = pd.ExcelWriter(excel_file_path)
pandas_df.to_excel(writer, index = False)
writer.save()
writer.close()
# Get the absolute path
file_path = os.path.abspath('output.xlsx')
# Print the file location
print(f'The location of the Excel file is: {file_path}') #this is showing a different path as /home/spark-d377ec0c-caf2-4f42-9dc7-34/output.xlsx
I tried multiple ways of providing the directory, also tried to write in different ways. there is no error message, but I am not able to find the excel.
I am just learning Python, and trying to load some records into an excel. Here is the code I am running.
# Install the openpyxl module
#%pip install openpyxl
# Import pandas
import pandas as pd
import os
df = spark.sql("""
select distinct BILLING_ACCOUNT_NUMBER from prd_pdt_general_scm.scm_core.subscriber_soc
where SOC_CODE='P360HT'
and SUBSCRIBER_NUMBER ='0000000000'
and curr_ind='Y'
""")
#display(df) # this is showing the results correctly.
# Convert the Spark DataFrame to a Pandas DataFrame
pandas_df = df.toPandas()
# Define the directory and file path
directory = r'C:/Users/achatte17/Documents/ISE'
excel_file_path = os.path.join(directory, 'output.xlsx')
display(excel_file_path) #Also showing the path correctly
# Create the directory if it does not exist
os.makedirs(directory, exist_ok=True)
# Save the DataFrame to an Excel file
writer = pd.ExcelWriter(excel_file_path)
pandas_df.to_excel(writer, index = False)
writer.save()
writer.close()
# Get the absolute path
file_path = os.path.abspath('output.xlsx')
# Print the file location
print(f'The location of the Excel file is: {file_path}') #this is showing a different path as /home/spark-d377ec0c-caf2-4f42-9dc7-34/output.xlsx
I tried multiple ways of providing the directory, also tried to write in different ways. there is no error message, but I am not able to find the excel.
Try setting the directory to the current directory itself, so you can just see if it is properly saved.
directory = r'./'
excel_file_path = os.path.join(directory, 'output.xlsx')
C:/Users/achatte17/Documents/ISE/output.xlsx
? – 101 Commented Feb 4 at 2:57file_path = os.path.abspath('output.xlsx')
which is where the python script is running (unless of course they are the same directory). Even if the pathC:/Users/achatte17/Documents/ISE
didn't exist the program would create it, although it's possible you may get an 'access denied' creating files and paths in 'C:\Users' but that event should give such an error. – moken Commented Feb 4 at 4:04