Commit 42ae5fb

retest, QA, and add chapter 2 book excerpt

1 parent 196cd3f commit 42ae5fb

2 files changed: +83 −63 lines

data_augmentation_with_python_chapter_1.ipynb

Lines changed: 45 additions & 44 deletions
@@ -4,7 +4,7 @@
   "metadata": {
     "colab": {
       "provenance": [],
-      "authorship_tag": "ABX9TyP/ew1Uv62mJHN1xFzV2pd8",
+      "authorship_tag": "ABX9TyNy8FkAGn4E4n9p2HthAnNz",
       "include_colab_link": true
     },
     "kernelspec": {
@@ -31,7 +31,7 @@
   {
     "cell_type": "markdown",
     "source": [
-      "# Data Augmentation with Python, Chapter 1"
+      "# 💚 Data Augmentation with Python, Chapter 1"
     ],
     "metadata": {
       "id": "qtnHR_uG0m7Z"
@@ -40,17 +40,12 @@
   {
     "cell_type": "markdown",
     "source": [
-      "## 🌻 Welcome to Chapter 1, section Programing Style and Pluto\n",
+      "# 🌻 Welcome to Chapter 1, Data Augmentation Made Easy\n",
       "\n",
-      " - GitHub access \n",
+      "#### I am glad to see you using this Python Notebook. It is an integral part of the book. You are free to add new “code cells” to extend the functions, add your data, and explore new possibilities, such as downloading additional real-world datasets from the Kaggle website and coding the **Fun challenges**. I also encourage you to add “text cells” to keep your notes in the Python Notebook.\n",
       "\n",
-      " - Object-Oriented \n",
-      "\n",
-      " - Full library variable name \n",
-      "\n",
-      " - Export to pure Python code \n",
-      "\n",
-      " - Coding companion "
+      "#### The book contains in-depth augmentation concepts and code explanations. I hope you enjoy reading the book and hacking code on this Python Notebook as much as I enjoyed writing it.\n",
+      "\n"
     ],
     "metadata": {
       "id": "a5EqgaYW08Cm"
@@ -59,33 +54,52 @@
   {
     "cell_type": "markdown",
     "source": [
-      "## GitHub Access (from Collab)"
-    ],
-    "metadata": {
-      "id": "0Mo-idJS3eDh"
-    }
-  },
-  {
-    "cell_type": "markdown",
-    "source": [
-      "1. From the Colab menu, click on \"File\"\n",
+      "# 😀 Excerpt from Chapter 1, Data Augmentation Made Easy\n",
+      "\n",
+      "> In case you haven’t bought the book, here is an excerpt from the first page of Chapter 1. The book is on the Amazon Book website: https://www.amazon.com/dp/1803246456\n",
+      "\n",
+      "\n",
+      "Data augmentation is essential for developing a successful Deep Learning (DL) project. However, data scientists and developers often overlook this crucial step. It is not a secret that you will spend the majority of the project time gathering, cleaning, and augmenting the dataset in a real-world DL project. Thus, learning how to expand the dataset without purchasing new data is essential. This book covers standard and advanced techniques for extending image, text, audio, and tabular datasets. Furthermore, there is a discussion on data biases, and the coding lessons are on Jupyter Python Notebooks.\n",
+      "\n",
+      "Chapter 1 introduces the data augmentation concepts, sets up the coding environment, and creates the foundation class; later chapters explain the techniques in detail, including the Python coding. The effective use of data augmentation is a proven technique that separates success from failure in Machine Learning (ML). Many real-world ML projects stay in the conceptual phase because of insufficient data for training the ML model. Data augmentation is a cost-effective technique to increase the dataset, lower the training error rate, and produce more accurate predictions and forecasts.\n",
+      "\n",
+      "\n",
+      ">**Fun fact**\n",
+      "\n",
+      ">The car gasoline analogy is helpful for students who first learn about data augmentation and AI. You can think of the data for the AI engine as the gasoline and data augmentation as the additive, like the Chevron Techron fuel cleaner, that makes your car engine run faster, smoother, and further without extra petrol.\n",
+      "\n",
+      "\n",
+      "\n",
+      "In Chapter 1, we’ll define the data augmentation role and the limits of how much you can extend the data without compromising its integrity. We’ll briefly discuss the different types of input data, such as image, text, audio, and tabular data, and the challenges in supplementing the data. Finally, we’ll set up the system requirements and the programming style in the accompanying Jupyter Python Notebook.\n",
       "\n",
-      "1. Select \"Open Notebook\"\n",
+      "I designed this book to be a hands-on journey. It will be more effective to read a chapter, run the code, re-read the part of the chapter that confused you, and jump back to hacking the code until the concept or technique is firmly understood.\n",
       "\n",
-      "1. Click on \"GitHub\" tab\n",
+      "You are encouraged to change or add new code to the Python Notebooks. The primary purpose is interactive learning; thus, if something goes horribly wrong, download a fresh copy from the book’s GitHub repository. The surest method to learn is to make mistakes and create something new.\n",
       "\n",
-      "1. Enter \"https://github.com/PacktPublishing/data-augmentation-with-python\" in the \"Repository\" field. \n",
+      "Data augmentation is an iterative process. There is no fixed recipe. In other words, depending on the dataset, you select augmentation functions and jiggle the parameters. A subject domain expert may provide insight into how much distortion is acceptable. By the end of Chapter 1, you will learn the general rules for data augmentation, what types of input data can be augmented, the programming style, and how to set up a Python Jupyter Notebook online or offline.\n",
       "\n",
-      "1. You should see all Notebooks available."
+      "In particular, Chapter 1 will cover the following primary topics:\n",
+      "\n",
+      "- Data augmentation role\n",
+      "\n",
+      "- Data input types\n",
+      "\n",
+      "- Python Jupyter Notebook\n",
+      "\n",
+      "- Programming styles\n",
+      "\n",
+      "Let’s start with the data augmentation role.\n",
+      "\n",
+      "[end of excerpt]"
     ],
     "metadata": {
-      "id": "hJ6ydtag4Nvc"
+      "id": "0Mo-idJS3eDh"
     }
   },
   {
     "cell_type": "markdown",
     "source": [
-      "## Object-Oriented"
+      "## Programming styles"
     ],
     "metadata": {
       "id": "ukspBUMp4N97"
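The excerpt above describes augmentation as an iterative process: you select augmentation functions and jiggle the parameters until the distortion is acceptable. As a hedged illustration only (the `jitter` helper and the toy data are mine, not from the book or the notebooks), here is a minimal tabular-data sketch of that idea:

```python
import random

def jitter(rows, scale=0.02, seed=0):
    """Return noisy copies of numeric rows.

    `scale` is the parameter you "jiggle": it controls how much relative
    Gaussian distortion each value receives.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [[x * (1 + rng.gauss(0, scale)) for x in row] for row in rows]

# Toy dataset: two rows of two numeric features.
data = [[10.0, 250.0], [12.5, 300.0]]

# Original rows plus jittered copies: the dataset size doubles
# without purchasing any new data.
augmented = data + jitter(data)
```

How large `scale` may safely be depends on the dataset; as the excerpt notes, a subject domain expert is the best judge of acceptable distortion.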
@@ -106,8 +120,6 @@
   {
     "cell_type": "code",
     "source": [
-      "# url = 'https://github.com/duchaba/Data-Augmentation-with-Python' # for testing: remove after the book is finished.\n",
-      "\n",
       "url = 'https://github.com/PacktPublishing/Data-Augmentation-with-Python'\n",
       "!git clone {url}"
     ],
@@ -270,7 +282,8 @@
   {
     "cell_type": "code",
     "source": [
-      "# end of chapter 1"
+      "# end of chapter 1\n",
+      "print('End of chapter 1')"
     ],
     "metadata": {
       "id": "FWpmCx6sJiFg"
@@ -329,8 +342,8 @@
       "# f = 'Data-Augmentation-with-Python'\n",
       "# os.chdir(f)\n",
       "# !git add -A\n",
-      "# !git config --global user.email \"duc.haba@gmail.com\"\n",
-      "# !git config --global user.name \"duchaba\"\n",
+      "# !git config --global user.email \"duc.....com\"\n",
+      "# !git config --global user.name \"duc.....\"\n",
       "# !git commit -m \"end of session\"\n"
     ],
     "metadata": {
@@ -339,18 +352,6 @@
     "execution_count": null,
     "outputs": []
   },
-  {
-    "cell_type": "code",
-    "source": [
-      "# end of chapter 1\n",
-      "print('End of chapter 1')"
-    ],
-    "metadata": {
-      "id": "zeFM8XUHc1c7"
-    },
-    "execution_count": null,
-    "outputs": []
-  },
   {
     "cell_type": "markdown",
     "source": [

data_augmentation_with_python_chapter_2.ipynb

Lines changed: 38 additions & 19 deletions
@@ -25,28 +25,31 @@
     "id": "a5EqgaYW08Cm"
   },
   "source": [
-    "## 🌻 Welcome to Chapter 2, \"Biases and Data Augmentation\"\n"
-  ]
-  },
-  {
-    "cell_type": "markdown",
-    "metadata": {
-      "id": "1qKNeMfdruTQ"
-    },
-    "source": [
-      "In this chapter, we’ll cover the following primary topics. \n",
+    "# 🌻 Welcome to Chapter 2, Biases In Data Augmentation\n",
     "\n",
-      "- Computational Biases \n",
+    "#### I am glad to see you using this Python Notebook. It is an integral part of the book. You are free to add new “code cells” to extend the functions, add your data, and explore new possibilities, such as downloading additional real-world datasets from the Kaggle website and coding the **Fun challenges**. I also encourage you to add “text cells” to keep your notes in the Python Notebook.\n",
     "\n",
-      "- Human Biases \n",
+    "#### The book contains in-depth augmentation concepts and code explanations. I hope you enjoy reading the book and hacking code on this Python Notebook as much as I enjoyed writing it.\n",
     "\n",
-      "- Systemic Biases \n",
+    "## ⭐\n",
     "\n",
-      "- Deep Dive to Image Augmentation Biases \n",
+    "- The book is on the Amazon Book website: https://www.amazon.com/dp/1803246456\n",
     "\n",
-      "- Deep Dive to Text Augmentation Biases "
+    "- The original Python Notebook is at: https://github.com/PacktPublishing/Data-Augmentation-with-Python/blob/main/data_augmentation_with_python_chapter_2.ipynb\n"
   ]
   },
+  {
+    "cell_type": "markdown",
+    "source": [
+      "# 😀 Excerpt from Chapter 2, Biases In Data Augmentation\n",
+      "\n",
+      "> In case you haven’t bought the book, here is an excerpt from the first page of Chapter 2.\n",
+      "\n"
+    ],
+    "metadata": {
+      "id": "Zr2G6GSmF_za"
+    }
+  },
   {
     "cell_type": "markdown",
     "source": [
@@ -216,11 +219,14 @@
   {
     "cell_type": "markdown",
     "source": [
-      "# Get Kaggle ID, key, and setup\n",
+      "# Get Kaggle ID, key, and setup\n",
       "\n",
       "✋ STOP\n",
       "\n",
-      "- First, sign up on kaggle.com. Get username and api key (refer to the book, Chapter 2)"
+      "1. First, sign up on kaggle.com and get your username and API key (refer to the book, Chapter 2)\n",
+      "\n",
+      "2. You MUST join the State Farm Distracted Driver Competition on the Kaggle website to download the data.\n",
+      "   - Go to: https://www.kaggle.com/competitions/state-farm-distracted-driver-detection/overview and click on the \"Join Competition\" button"
     ],
     "metadata": {
       "id": "Y9azxfCb7tu4"
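The Kaggle setup step above relies on the Kaggle CLI, which reads your credentials from a `kaggle.json` file (by default under `~/.kaggle/`). As a hedged sketch only (the `write_kaggle_token` helper is mine, not from the notebook), one way to create that file from a notebook cell, assuming you already have your Kaggle username and API key:

```python
import json
import stat
from pathlib import Path

def write_kaggle_token(username, key, config_dir=None):
    """Write the kaggle.json credential file that the Kaggle CLI looks for.

    By default the CLI reads ~/.kaggle/kaggle.json; `config_dir` overrides
    the location (handy for testing without touching your home directory).
    """
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    token = config_dir / "kaggle.json"
    token.write_text(json.dumps({"username": username, "key": key}))
    # Restrict to owner read/write (mode 600); the CLI warns on looser permissions.
    token.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return token
```

After joining the competition on the Kaggle website, the data can then be fetched with the CLI, e.g. `kaggle competitions download -c state-farm-distracted-driver-detection`.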
@@ -361,6 +367,19 @@
       "import os"
     ]
   },
+  {
+    "cell_type": "markdown",
+    "source": [
+      "✋ STOP\n",
+      "\n",
+      "- You must join the State Farm Kaggle competition to download the data.\n",
+      "\n",
+      "- Go to: https://www.kaggle.com/competitions/state-farm-distracted-driver-detection/overview and click on the \"Join Competition\" button"
+    ],
+    "metadata": {
+      "id": "msyHmKjHHBx8"
+    }
+  },
   {
     "cell_type": "code",
     "execution_count": null,
@@ -558,7 +577,7 @@
       " fn = f'{dname}/pluto{self.fname_id}{format}'\n",
       " else:\n",
       " fn = fname\n",
-      " canvas.savefig(fn, cmap=\"Greys\", bbox_inches=\"tight\", pad_inches=0.25)\n",
+      " canvas.savefig(fn, bbox_inches=\"tight\", pad_inches=0.25)\n",
       " return\n",
       "#\n",
       "@add_method(PacktDataAug)\n",
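The change above drops `cmap="Greys"` from the `savefig` call, which makes sense: in Matplotlib, `cmap` is a parameter of plotting calls such as `imshow`, not a documented argument of `Figure.savefig`. A minimal sketch of the corrected pattern (my own standalone example, not the notebook's `PacktDataAug` code):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import os
import tempfile

def save_demo_plot(fname):
    """Render a small grayscale image and save it to `fname`."""
    fig, ax = plt.subplots()
    ax.imshow([[0, 1], [1, 0]], cmap="Greys")  # cmap belongs on imshow...
    fig.savefig(fname, bbox_inches="tight", pad_inches=0.25)  # ...not on savefig
    plt.close(fig)
    return os.path.exists(fname)

out = os.path.join(tempfile.mkdtemp(), "pluto_demo.png")
save_demo_plot(out)
```

`bbox_inches="tight"` and `pad_inches` are genuine `savefig` parameters and are kept; the grayscale look comes from the `cmap` passed to `imshow` when the image is drawn.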
@@ -1553,7 +1572,7 @@
   "accelerator": "GPU",
   "colab": {
     "provenance": [],
-    "authorship_tag": "ABX9TyOCKRjIV1WBzvCeqW9gwOq+",
+    "authorship_tag": "ABX9TyMiRx8sTByugrd2/wq4Kjzk",
     "include_colab_link": true
   },
   "gpuClass": "standard",
