Is there a correct way to process Cyrillic texts using awk?
There are two files: country.csv and president.csv.
country.csv has two columns: 1) country name; 2) population.
president.csv also has two columns: 1) country name; 2) the president's name.
The separator is a semicolon.
I need to produce a third file (or add a column to the first one, it doesn't matter) in which all three fields are on one line: country name; population; president's name.
The files have different numbers of lines, and some countries may appear in only one of them, so simply sorting both files and pasting the columns together will not work. For each line of the first file, I need to find the line in the second file whose first field matches and take the second field from it.
I'm trying to do it with a script like this:
#!/bin/bash
while read LINE; do
C_NAME=$(echo $LINE | cut -d";" -f1)
awk -v country=$C_NAME -v line=$LINE -F";" '$1 == country {print line";"$2}' president.csv >>result.csv
done < country.csv
But awk fails with this error:
awk: cmd. line:1: Albania
awk: cmd. line:1: ^ invalid char '�' in expression
Your code is correct; the problem is most likely in the data.
If you upload the CSV files somewhere, I can take a closer look.
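A possible way to narrow this down, offered as a sketch rather than a confirmed fix: in the question's script the expansions $C_NAME and $LINE are unquoted, so the shell splits them on whitespace, and awk may then receive a fragment of a data line as its program text, which would be consistent with the error shown. The data itself can also carry a UTF-8 byte order mark or Windows line endings that break the comparison. The function below (join_by_country is a made-up name) assumes both files are UTF-8, semicolon-separated and non-empty, and does the lookup in a single awk pass without a shell loop:

```shell
#!/bin/bash
# join_by_country COUNTRY_FILE PRESIDENT_FILE
# Hypothetical helper: prints "country;population;president" for every
# country that is present in both files, in the order of COUNTRY_FILE.
join_by_country() {
    # LC_ALL=C makes awk compare raw bytes, so Cyrillic names are matched
    # byte for byte regardless of the current locale.
    LC_ALL=C awk -F';' '
        # Drop a UTF-8 byte order mark (octal 357 273 277) at the start of each file.
        FNR == 1 { sub(/^\357\273\277/, "") }
        # Drop a trailing carriage return left by Windows line endings.
        { sub(/\r$/, "") }
        # First file read (president.csv): remember the president per country.
        NR == FNR { president[$1] = $2; next }
        # Second file read (country.csv): append the president when one is known.
        $1 in president { print $0 ";" president[$1] }
    ' "$2" "$1"
}
```

It would be called as join_by_country country.csv president.csv > result.csv. Countries missing from president.csv are skipped, as in the original script.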