tools 5 February 2020 2 min read

Linux Data Manipulation

Essential Linux commands for file creation and manipulation, including seq, rev, for loops, sed, and split for handling large data files.

Mark

Performance Testing Expert

The Linux command line is well suited to creating and manipulating files, and it handles large files more efficiently than typical Windows alternatives.

Key Commands

SEQ Command

Generates number sequences. With three arguments the order is starting number, increment, maximum value; with two arguments the increment defaults to 1.

# Generate numbers 1 to 10
seq 1 10

# Generate numbers from 1 up to 100 in steps of 5 (1, 6, 11, ... 96)
seq 1 5 100

# Generate with leading zeros
seq -w 01 10
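
Two further options are often handy when building data files. The following is a minimal sketch assuming GNU seq (the version shipped with most Linux distributions); the variable names are illustrative only.

```shell
# -s sets the separator, so the sequence comes out as one comma-separated row
csv_row=$(seq -s, 1 5)
echo "$csv_row"

# -f takes a single printf-style directive; %02g pads each number to two digits
names=$(seq -f "file-%02g.csv" 1 3)
echo "$names"
```

The -f form is a convenient way to produce a list of numbered filenames without writing a loop.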

REV Command

Reverses the characters of each input line. Combined with other commands, it gives easy access to the end of a string, such as the last field of a filename.

# Reverse a string
echo "hello" | rev
# Output: olleh

# Extract the last dash-separated field of a filename, then change 01 to 02
echo "filename-01.csv" | rev | cut -d'-' -f1 | rev | sed 's/01/02/'
# Output: 02.csv
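
Another use of rev is changing only the last occurrence of a pattern, since the last occurrence becomes the first one once the line is reversed. This is a sketch rather than a general-purpose rename: the filename is a made-up example, and the search and replacement text must be written backwards (10 and 20 for 01 and 02).

```shell
name="report-01-final-01.csv"   # hypothetical filename with two "01" parts

# Reverse, replace the first "10" (the last "01" in the original), reverse back
renamed=$(echo "$name" | rev | sed 's/10/20/' | rev)
echo "$renamed"
# Output: report-01-final-02.csv
```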

FOR Command

Repeats a block of commands once for each item in a list, such as a number sequence or a set of files.

# Loop through 1 to 10
for i in $(seq 1 10); do
    echo "Processing file $i"
done

# Loop through files
for file in *.csv; do
    echo "Found: $file"
done
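
When no file matches, the shell by default passes the pattern *.csv through unchanged, so the loop body runs once with a name that does not exist. A small guard avoids that. The sketch below works in a scratch directory with two made-up sample files and counts the lines in each.

```shell
tmp=$(mktemp -d)                       # scratch directory for the sample data
printf 'a,b\n1,2\n' > "$tmp/one.csv"
printf 'a,b\n1,2\n3,4\n' > "$tmp/two.csv"

total=0
for file in "$tmp"/*.csv; do
    # Skip the unexpanded pattern if nothing matched
    [ -e "$file" ] || continue
    lines=$(wc -l < "$file")
    echo "$(basename "$file"): $lines lines"
    total=$((total + lines))
done
echo "Total: $total lines"
```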

SED Command

Edits text as a stream, which makes it very powerful for text transformations. With -i (GNU sed), the changes are written back to the file in place.

# Add a header row to a CSV file
sed -i '1i column1,column2' filename01.csv

# Replace text in a file
sed -i 's/old_text/new_text/g' filename.csv

# Delete first line
sed -i '1d' filename.csv
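
Because -i overwrites the file, it can be worth previewing a change first or keeping a backup. A minimal sketch, using a made-up sample file in a scratch directory:

```shell
tmp=$(mktemp -d)
printf 'old_text,1\nold_text,2\n' > "$tmp/data.csv"

# Preview: without -i, sed prints the result and leaves the file untouched
sed 's/old_text/new_text/g' "$tmp/data.csv"

# Edit in place, keeping the original as data.csv.bak
sed -i.bak 's/old_text/new_text/g' "$tmp/data.csv"

cat "$tmp/data.csv"
```

If the result is wrong, the .bak copy can simply be moved back over the edited file.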

SPLIT Command

Splits a large file into smaller chunks. This is essential when a million-row dataset is too big to handle in one piece.

# Split file into 10,000 line chunks
split -l 10000 largefile.csv

# This creates output files: xaa, xab, xac, etc.

# Split with custom prefix
split -l 10000 largefile.csv chunk_

# This creates: chunk_aa, chunk_ab, chunk_ac, etc.
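
Note that split cuts purely by line count, so only the first chunk of a CSV file keeps the header row. One possible workaround is sketched below, assuming GNU split (for the -d numeric suffix option); the file and prefix names are placeholders.

```shell
tmp=$(mktemp -d)
{ echo "id"; seq 1 25; } > "$tmp/large.csv"    # small stand-in for a large file

header=$(head -n 1 "$tmp/large.csv")

# Split only the data rows; "-" reads from stdin, -d gives chunk_00, chunk_01, ...
tail -n +2 "$tmp/large.csv" | split -l 10 -d - "$tmp/chunk_"

# Put the header back on top of every chunk
for part in "$tmp"/chunk_*; do
    { echo "$header"; cat "$part"; } > "$part.csv"
    rm "$part"
done

ls "$tmp"
```

With 25 data rows and 10 rows per chunk, this leaves chunk_00.csv, chunk_01.csv, and chunk_02.csv, each starting with the header.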

Combining Commands

These commands become powerful when combined:

# Create numbered CSV files with headers
for i in $(seq -w 01 10); do
    echo "column1,column2,column3" > "file-$i.csv"
    echo "data1,data2,data3" >> "file-$i.csv"
done

# Process all CSV files
for file in *.csv; do
    sed -i '1i new_header' "$file"
done
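
The same building blocks can also generate synthetic test data. As a rough sketch, seq supplies the row numbers and sed expands each one into a full row; the column names and the example.com addresses are invented for illustration.

```shell
tmp=$(mktemp -d)
out="$tmp/users.csv"

echo "id,username,email" > "$out"

# In the replacement, & stands for the whole matched line (the row number)
seq 1 1000 | sed 's/.*/&,user&,user&@example.com/' >> "$out"

head -n 3 "$out"
wc -l < "$out"
```

Raising the upper bound passed to seq scales the file to whatever size a test needs.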

Tags:

#linux #bash #scripting #data-manipulation #command-line
