Table Of Contents
Excel to CSV UTF8
If Excel contains any smart quotes or other language data while converting it from Excel to CSV, it would not convert accurately and get question marks. So while importing the data, we also get the same things. However, using Excel CSV UTF8 as the encoding, we can easily choose the file format and language.
What is UTF8 Encoding?
Unicode Transformation Format is the format for converting any character to Unicode standard that can be as compact as ASCII.
ASCII: American Standard Code for Information Interchange
CSV format files have been there for long years. And all we know is just the basic level of exporting and importing the data. But in addition to that, some advanced techniques are also there. For example, data with special characters need a different procedure to convert the data from Excel to CSV. This article will show you how to deal with special characters with an Excel CSV file, i.e., CSV UTF8.
Example of UTF8 Encoding in Excel CSV File
For example, look at the below data in the Excel worksheet.
These are the names in Japanese, and we have proper English meanings for each of the words here. Now let us convert the file to CSV format.
Follow the steps to convert Excel to CSV file format.
-
Click the "Save As" button or press the F12 shortcut in Excel.
- Choose the destination folder to save the file as a "CSV" file.
- Change the "Save As" file type to "CSV (Comma delimited)."
- Click on the “Save” button, and our data will be saved as a "CSV" file.
With the file extension, we can easily view the file format.
Look at the above files, which are named the same, but the file extension is “csv” and “.xlsm,” so this shows that both the file formats are different.
Now, open the CSV file and see what we get.
Look at the values in columns B and C (where we had Japanese words), we have got question marks instead of actual words, and when we try importing the data to excel, we still get question marks only.
So this is the problem with special character data conversion to CSV file. To avoid this, we need to follow different procedures below the procedure.
Save File As Unicode Text Format
Before we save the data in the CSV file format, we need to first save it as "Unicode Text (*.txt)," then convert it to a CSV file, following the below steps.
Step 1: Once again, we must open the "Save As" window, and this time, choose the file format as "Unicode Text (*.Txt)" format.
Step 2: Click on the "Save" button, and it will save the file in "Unicode" format.
We have three format files: "Excel," "CSV," and "Unicode Text Format."
Now, open a "Unicode Text" file and see how the data looks.
Data looks as it is in the text file.
We need to save the above file as an encoding of the UTF-8 method to get the correct data into Excel. So now, insert the delimiter for the above data by replacing the space characters. Copy the first space character.
Now open "Find & Replace in Excel" and replace the copied space with a comma (,).
Click on "Replace All." A comma will replace all the space characters.
Now, choose the "Save As" option.
Subsequently, this will open up the "Save As" window. In that window, choose the "Encoding" option as "UTF-8."
Click on the "Save" button to complete the save option.
Now, change the file extension from “.txt” to “.csv.”
Now open the CSV file, and we should get the correct Japanese words instead of the question mark.
Things to Remember
- UTF converts special character words through encoding.
- Save the file as "Unicode" before converting it to "CSV" and encoding it as "UTF-8."