Table Of Contents

arrow

Excel to CSV UTF8

If Excel contains any smart quotes or other language data while converting it from Excel to CSV, it would not convert accurately and get question marks. So while importing the data, we also get the same things. However, using Excel CSV UTF8 as the encoding, we can easily choose the file format and language.

What is UTF8 Encoding?

Unicode Transformation Format is the format for converting any character to Unicode standard that can be as compact as ASCII.

ASCII: American Standard Code for Information Interchange

CSV format files have been there for long years. And all we know is just the basic level of exporting and importing the data. But in addition to that, some advanced techniques are also there. For example, data with special characters need a different procedure to convert the data from Excel to CSV. This article will show you how to deal with special characters with an Excel CSV file, i.e., CSV UTF8.

Excel-CSV-UTF8

Example of UTF8 Encoding in Excel CSV File

For example, look at the below data in the Excel worksheet.

Excel CSV UTF8 Example 1

These are the names in Japanese, and we have proper English meanings for each of the words here. Now let us convert the file to CSV format.

Follow the steps to convert Excel to CSV file format.

  1. Click the "Save As" button or press the F12 shortcut in Excel.

  2. Choose the destination folder to save the file as a "CSV" file.


    Excel CSV UTF8 Example 1-1

  3. Change the "Save As" file type to "CSV (Comma delimited)."


    Excel CSV UTF8 Example 1-2

  4. Click on the “Save” button, and our data will be saved as a "CSV" file.


    Excel CSV UTF8 Example 1-3
    With the file extension, we can easily view the file format.

    Look at the above files, which are named the same, but the file extension is “csv” and “.xlsm,” so this shows that both the file formats are different.

Now, open the CSV file and see what we get.

Excel CSV UTF8 Example 1-4

Look at the values in columns B and C (where we had Japanese words), we have got question marks instead of actual words, and when we try importing the data to excel, we still get question marks only.

So this is the problem with special character data conversion to CSV file. To avoid this, we need to follow different procedures below the procedure.

Save File As Unicode Text Format

Before we save the data in the CSV file format, we need to first save it as "Unicode Text (*.txt)," then convert it to a CSV file, following the below steps.

Step 1: Once again, we must open the "Save As" window, and this time, choose the file format as "Unicode Text (*.Txt)" format.

Excel CSV UTF8 Example 1-5

Step 2: Click on the "Save" button, and it will save the file in "Unicode" format.

Excel CSV UTF8 Example 1-6

We have three format files: "Excel," "CSV," and "Unicode Text Format."

Now, open a "Unicode Text" file and see how the data looks.

Excel CSV UTF8 Example 1-7

Data looks as it is in the text file.

We need to save the above file as an encoding of the UTF-8 method to get the correct data into Excel. So now, insert the delimiter for the above data by replacing the space characters. Copy the first space character.

Example 1-8 (Select Space)

Now open "Find & Replace in Excel" and replace the copied space with a comma (,).

Example 1-9 (Find)

Click on "Replace All." A comma will replace all the space characters.

Example 1-10 (Replace all)

Now, choose the "Save As" option.

Example 1-11 (Save As)

Subsequently, this will open up the "Save As" window. In that window, choose the "Encoding" option as "UTF-8."

Excel CSV UTF8 Example 1-12

Click on the "Save" button to complete the save option.

Example 1-13 (Save)

Now, change the file extension from “.txt” to “.csv.”

Example 1-14 (File Extension)

Now open the CSV file, and we should get the correct Japanese words instead of the question mark.

Example 1-15 (CSV File)

Things to Remember

  • UTF converts special character words through encoding.
  • Save the file as "Unicode" before converting it to "CSV" and encoding it as "UTF-8."