I'd like to remove a few lines from a csv file.
The rules are simple enough. Keep a line if:
- It's the first line in the file.
- The first value is different from the first value of the previous row.
- The second value has increased by at least 10 from the previous kept line.
Source:
Test1, 0.0, 1
Test1, 0.2, 1
Test1, 10.0, 3
Test2, 0.1, 1
Test2, 0.3, 3
Test2, 1.0, 5
Test2, 11.0, 7
Result:
Test1, 0.0, 1
Test1, 10.0, 3
Test2, 0.1, 1
Test2, 11.0, 7
I was thinking of doing this with awk and a few if statements, but I'm not certain whether a variable can keep its value from one record to the next.
EDIT (originally posted by me in the comments section):
I just found out that variables are usable between records, which doesn't work quite like C. I'll remove this question unless someone gives an answer that I deem useful for others or someone asks me to provide the answer.
Since it's tagged with awk:
awk -F", *" 'x!=$1||$2>=y+10{y=$2;print}{x=$1}' file
Test1, 0.0, 1
Test1, 10.0, 3
Test2, 0.1, 1
Test2, 11.0, 7
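For readers who find the one-liner terse, here is an expanded sketch of the same logic. The function name filter_rows and the variable names prev_key and last_kept are made up for illustration; they stand in for x and y in the one-liner. It assumes the second field is always numeric and the first field is never empty.

```shell
# Expanded form of the one-liner, wrapped in a shell function.
# filter_rows is a hypothetical name; it reads comma-separated rows on stdin.
filter_rows() {
    awk -F', *' '
        # Keep the row if the key differs from the previous row, or if the
        # value is at least 10 above the value of the last kept row.
        # On the first row prev_key is still unset (empty string), so the
        # key test succeeds as long as the first field is non-empty.
        $1 != prev_key || $2 >= last_kept + 10 {
            last_kept = $2   # remember the value of the row just kept
            print
        }
        # Runs for every row, kept or not: remember the key for the next row.
        { prev_key = $1 }
    '
}
```

Called as filter_rows < file, it should print the same four rows as above. Both variables are plain awk globals, so they keep their values from one record to the next without any declaration.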