Introduction
Welcome to the AWK Mastery Challenge. You have been dropped into a server with realistic, albeit messy, data.
- Log files are in /root/awk_challenge/logs/access/.
- HR CSV files are in /root/awk_challenge/data/hr/.
- System metrics are in /root/awk_challenge/sys/metrics/.
Your mission is to perform 20 tasks ranging from basic extraction to complex SysAdmin reporting.
Phase 1: Fundamentals
Before you jump into your favorite AI chatbot, try to solve these tasks using only man awk and awk --help.
Write your own cheat sheet as you go; it's worth it!
Task 1: Basic Extraction
File: /root/awk_challenge/logs/access/access.log.
Goal: Extract only the IP Address (1st Column) and save it to /root/q1_ips.txt.
Task 2: CSV Delimiters
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Extract Name (Col 2) and Department (Col 4). The file uses commas ,.
Save to /root/q2_names.txt.
Task 3: Dynamic Columns ($NF)
File: /root/awk_challenge/sys/metrics/processes.txt.
Goal: Extract the Command (the last column), regardless of messy spacing.
Save to /root/q3_commands.txt.
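If $NF is new to you, here is a minimal sketch on made-up input (the two sample lines are invented and do not reflect the real processes.txt). NF holds the number of fields in the current line, so $NF is whatever field comes last, however many spaces separate the columns.

```shell
# Invented sample with irregular spacing; the real file may look different.
sample='root   1    0.0   /sbin/init
www  742 1.3      nginx'

# Default field splitting collapses runs of spaces and tabs.
# NF is the field count, so $NF is the last field on each line.
out=$(printf '%s\n' "$sample" | awk '{ print $NF }')
printf '%s\n' "$out"
```

Keep in mind that if a command contains spaces of its own, $NF returns only its last word.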
Task 4: String Formatting
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Print the name prefixed with a label. Expected format: User: [Name].
Save to /root/q4_formatted.txt.
Task 5: Cleaning Messy Data
File: /root/awk_challenge/data/hr/sales_raw.txt.
Goal: Extract Product (Col 2) and Price (Col 4).
Warning: The file uses semicolons ; as delimiters, but with spaces around them (e.g. " ; "). You must strip these spaces during extraction.
Save to /root/q5_clean.txt.
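One way to approach padded delimiters is to let the field separator itself swallow the padding. The sketch below runs on invented rows (not the real sales_raw.txt) and relies on the fact that a multi-character -F value is treated as a regular expression.

```shell
# Invented sample rows; spacing around ';' is deliberately inconsistent.
sample='101 ; Widget ; 3 ; 9.99
102 ;Gadget;  1 ; 24.50'

# The separator is the regex " *; *": a semicolon plus any spaces on either side,
# so the padding is consumed as part of the delimiter.
out=$(printf '%s\n' "$sample" | awk -F' *; *' '{ print $2, $4 }')
printf '%s\n' "$out"
```

This does not trim spaces at the very start or end of a line; if your data has those, a gsub() call on the affected field is one option.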
Task 6: Numerical Filtering
File: /root/awk_challenge/logs/access/access.log.
Goal: Extract full lines where the Status Code (Col 9) is 500 or greater.
Save to /root/q6_errors.txt.
Task 7: String Filtering
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Extract full lines where Department (Col 4) is exactly "Sales".
Save to /root/q7_sales.txt.
Phase 2: Logic & Regex
Task 8.1: Regex Inclusion & Exclusion
File: /root/awk_challenge/logs/access/access.log.
Goal: Find lines where User Agent contains Mozilla BUT does not contain Windows.
Save to /root/q8_1_filter.txt.
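Patterns can be combined with && and negated with !. The following sketch uses three invented log lines in a common access-log layout, which is an assumption about the real file. Note that it tests the whole line rather than isolating the User Agent field.

```shell
# Invented log lines; the real access.log format may differ.
sample='10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"
10.0.0.3 - - [01/Jan/2024:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"'

# A bare /regex/ matches against the whole line ($0).
# With no action block, awk prints every line for which the pattern is true.
out=$(printf '%s\n' "$sample" | awk '/Mozilla/ && !/Windows/')
printf '%s\n' "$out"
```

Only the first sample line survives: the second contains Windows and the third lacks Mozilla.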
Task 8.2: Starts With (Anchor)
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Find employees whose Name starts with the letter 'C'.
Save to /root/q8_2_starts.txt.
Task 8.3: Ends With (Anchor)
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Find lines where Department ends with "ing" (e.g. Engineering).
Save to /root/q8_3_ends.txt.
Task 9: Logical AND
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Find employees in "Support" who earn more than 60000.
Save to /root/q9_complex.txt.
Task 10: Calculations on the Fly
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Print Name and Net Salary (Gross Salary * 0.8).
Save to /root/q10_calc.txt.
Task 11: Aggregation (Total Sum)
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Calculate the Sum of all salaries. Print only the final number.
Save to /root/q11_total.txt.
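Aggregation in awk usually means accumulating in a variable per line and printing once in an END block. Here is a small sketch on invented rows; placing the salary in column 5 is an assumption, so check the real employees.csv before relying on it.

```shell
# Invented rows; salary in column 5 is an assumption about the layout.
sample='1,Alice,30,Sales,50000
2,Bob,41,Support,65000'

# total starts out as 0 automatically; END runs once after the last line.
out=$(printf '%s\n' "$sample" | awk -F',' '{ total += $5 } END { print total }')
printf '%s\n' "$out"
```

A header row, if present, would contribute 0 to the sum because its non-numeric text converts to 0.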
Task 12: Row Ranges (NR)
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Extract exactly lines 10 to 20 (inclusive).
Save to /root/q12_range.txt.
Task 13: Reporting (BEGIN/END)
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Print "START" before reading the file, print the file content, and "END" after reading.
Save to /root/q13_report.txt.
Task 14: Output Field Separator
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Print Name and Dept separated by an arrow -> instead of a space.
Save to /root/q14_ofs.txt.
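The comma in a print statement is replaced by OFS, the output field separator, which defaults to a single space. A minimal sketch on invented rows follows; whether you want spaces around the arrow is up to you.

```shell
# Invented rows, not the real employees.csv.
sample='1,Alice,30,Sales,50000
2,Bob,41,Support,65000'

# OFS is set once in BEGIN, before any input is read.
# The comma between $2 and $4 is what gets replaced by OFS.
out=$(printf '%s\n' "$sample" | awk -F',' 'BEGIN { OFS = "->" } { print $2, $4 }')
printf '%s\n' "$out"
```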
Phase 3: SysAdmin & Advanced
Task 15: Parsing Config Files
File: /etc/passwd.
Goal: Extract the Username (1st col) for users who have /bin/bash as their shell (Last col).
Save to /root/q15_users.txt.
Task 16: Processing Network Stats
Run ss -lunt.
Goal: Extract only the Port Number from the Local Address column (e.g. 22 from 0.0.0.0:22).
Save to /root/q16_ports.txt.
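Addresses such as [::]:22 contain several colons, so taking "the part after the first colon" is not enough. One possible approach, sketched here on hand-written text that imitates ss output (the exact columns on your system are an assumption), is to split the field on colons and keep the last piece.

```shell
# Hand-written imitation of `ss -lunt` output; real output may vary.
sample='Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port
udp   UNCONN 0      0      127.0.0.53%lo:53   0.0.0.0:*
tcp   LISTEN 0      128    0.0.0.0:22         0.0.0.0:*
tcp   LISTEN 0      128    [::]:22            [::]:*'

# NR > 1 skips the header line.
# split() returns the number of pieces, so parts[n] is the text after the last colon.
out=$(printf '%s\n' "$sample" | awk 'NR > 1 { n = split($5, parts, ":"); print parts[n] }')
printf '%s\n' "$out"
```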
Task 17: Disk Monitoring
Run df -P.
Goal: Print the Filesystem and the Use% only for lines where the filesystem name starts with /dev.
Bonus: Try to remove the % sign using 0+$5.
Save to /root/q17_disk.txt.
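The 0+$5 trick works because awk converts a string to a number by reading its leading numeric part, so "53%" becomes 53. The sketch below uses invented lines in the df -P layout instead of calling df, so the numbers are made up.

```shell
# Invented lines in the `df -P` layout; values are made up.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 41152736 20576368 18462980 53% /
tmpfs 1024000 0 1024000 0% /dev/shm
/dev/sdb1 103081248 92773123 5048837 95% /data'

# $1 ~ /^\/dev/ keeps lines whose first field starts with /dev.
# 0+$5 forces numeric conversion, which drops the trailing % sign.
out=$(printf '%s\n' "$sample" | awk '$1 ~ /^\/dev/ { print $1, 0+$5 }')
printf '%s\n' "$out"
```

The tmpfs line is skipped even though its mount point is /dev/shm, because the pattern tests only the first field.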
Task 18: String Length Analysis
File: /root/awk_challenge/logs/access/access.log.
Goal: Filter lines that are longer than 80 characters (to find potentially malicious long requests).
Save to /root/q18_long.txt.
Task 19: Frequency Count
File: /root/awk_challenge/logs/access/access.log.
Goal: Count how many requests came from each IP Address.
Format: IP COUNT.
Save to /root/q19_count.txt.
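Counting by key is the classic use of awk's associative arrays. Here is a minimal sketch on three invented lines. The iteration order of for (key in array) is unspecified, so the output is piped through sort to make it stable.

```shell
# Invented log lines; only the first field (the IP) matters here.
sample='10.0.0.1 GET /index.html
10.0.0.2 GET /about.html
10.0.0.1 GET /contact.html'

# count[$1]++ creates the entry on first sight and increments it afterwards.
# The END block prints one "IP COUNT" line per distinct key.
out=$(printf '%s\n' "$sample" |
  awk '{ count[$1]++ } END { for (ip in count) print ip, count[ip] }' |
  sort)
printf '%s\n' "$out"
```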
Task 20: Ternary Operator
File: /root/awk_challenge/data/hr/employees.csv.
Goal: Print Name and a Status.
Status logic: If Salary > 80000 print "High", else print "Standard".
Save to /root/q20_status.txt.
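The ternary operator has the form condition ? value_if_true : value_if_false and can be used directly inside print. A small sketch on invented rows follows; salary in column 5 is again an assumption about the file layout.

```shell
# Invented rows; salary in column 5 is an assumption.
sample='1,Alice,30,Sales,50000
2,Carol,37,Engineering,90000'

# Parentheses around the ternary keep it a single print argument.
out=$(printf '%s\n' "$sample" |
  awk -F',' '{ print $2, ($5 > 80000 ? "High" : "Standard") }')
printf '%s\n' "$out"
```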