This book introduces powerful command line utilities for creating efficient shell scripts to process datasets. The examples and scripts use the bash shell and focus on small datasets to help readers understand the features of grep, sed, and awk. Companion files with code are available for download from the publisher.
The book starts with the basics: files and directories, and useful commands. It then progresses to conditional logic and loops, providing a solid foundation for processing datasets. Detailed chapters on grep, sed, and awk illustrate their capabilities for handling and cleaning various types of datasets.
Advanced topics include processing datasets with Pandas and working with NoSQL, SQLite, and Python. The book equips data scientists, analysts, and anyone seeking shell-based solutions with practical skills. By the end, readers will be able to write robust scripts for dataset processing that combine command line utilities.