python - Pandas displaying the first row but not indexing it - Stack Overflow

I have a large text file, with a header of 18 lines.

If I try to display the entire dataframe:

df = pd.read_csv('my_log')
print(df)

I get: pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 19, saw 3

If I try to exclude the header:

df = pd.read_csv('my_log', header=18)

I get the first row (line 19), followed by the second row (shown with index 0). No matter which index number I use in print(df.loc[[0]]), that first row is always displayed (with no index number) before the row that I want.

I've checked the text file, and every row ends in CR/LF. I've also removed line 19 completely, but the same behavior occurs.

Also, if I completely remove the header and print the entire dataframe, I still get the same behavior. The first row prints (without an index number) and the row count is 1 less than the true row count.

Any suggestions greatly appreciated!

Asked Jan 2 at 22:15 by yodish; edited Jan 3 at 8:48 by samhita.
  • Is it really a CSV file? It should have the same number of fields on every line. – Barmar Commented Jan 2 at 22:28
  • Don't use pd.read_csv() for regular text files. Just use file.readlines() to read it into a list of lines. – Barmar Commented Jan 2 at 22:29
  • Yes, you can try readlines. Another approach is to skip the rows and define the columns, for example df = pd.read_csv(file_path, skiprows=18, names=column_names), and then reset the index with df.reset_index(drop=True, inplace=True). Otherwise the file needs some processing by looping through the lines. – samhita Commented Jan 2 at 22:35
  • It is not really a csv file, I will try both these suggestions. Thanks! – yodish Commented Jan 2 at 22:43
  • @samhita, this was the solution; thanks! – yodish Commented Jan 3 at 1:13
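
The readlines suggestion in the comments could look roughly like the sketch below. The helper name read_log_lines, the sample contents, and the UTF-8 encoding are assumptions made for illustration; only the 18-line header and the CR/LF line endings come from the question.

```python
import os
import tempfile


def read_log_lines(path, header_lines=18):
    """Return the data lines of a text log, skipping the leading header lines."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    # Drop the header and strip the trailing line ending from each data line
    return [line.rstrip("\r\n") for line in lines[header_lines:]]


# Build a small stand-in file: 18 header lines followed by 3 data lines (CR/LF endings)
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write("".join(f"header {i}\r\n" for i in range(18)))
    tmp.write("a,1,x\r\nb,2,y\r\nc,3,z\r\n")
    sample_path = tmp.name

data_lines = read_log_lines(sample_path)
os.remove(sample_path)
print(data_lines)
```

This leaves a plain list of strings, so any splitting into fields still has to be done by hand; it is mainly useful when the file is not really delimiter-separated.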

1 Answer

One approach is to skip the header rows and define the column names explicitly:

import pandas as pd
column_names = ['Column1', 'Column2', 'Column3']  # placeholder names
# skip the 18 header lines and name the columns explicitly
df = pd.read_csv(file_path, skiprows=18, names=column_names)
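
A likely explanation for the behavior in the question is that header=18 makes pandas use line 19 as the column labels, so that line is printed at the top without an index number and is not counted as a row. The sketch below compares the two calls on an in-memory stand-in for the log; the sample text and column names are made up for illustration.

```python
import io

import pandas as pd

# Stand-in for the log: 18 single-field header lines, then three 3-field data rows
text = "".join(f"header {i}\n" for i in range(18)) + "a,1,x\nb,2,y\nc,3,z\n"

# header=18 treats line 19 as the column labels, so it is not counted as a data row
df_header = pd.read_csv(io.StringIO(text), header=18)

# skiprows=18 with names= keeps line 19 as data row 0 under the supplied labels
column_names = ["Column1", "Column2", "Column3"]
df_named = pd.read_csv(io.StringIO(text), skiprows=18, names=column_names)

print(list(df_header.columns), len(df_header))
print(list(df_named.columns), len(df_named))
```

With names= supplied, the result already has a default 0-based index, so the extra reset_index call mentioned in the comments should not be needed in this case.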