TensorFlow Course

Introduction

This page is a consequence of doing the PyTorch course. I like to compare frameworks to a) reinforce concepts and b) have a choice in how to approach work. I chose freeCodeCamp this time, from here. Before the middle of the course I changed direction and instead created ipynb notebooks to compare TensorFlow with PyTorch. This reinforced my learning, and the notebooks can be found on my git site: tensorflow and pyTorch.

Building from Source

My Software vs Required Software

I had a lot of trouble with CUDA and WSL. It suggested using 12.5.1, which I downloaded, and it still did not work. This is what I had versus what was needed, according to the robot:

TensorFlow 2.20.0 Compatibility Overview

Component     | My Version | Required for TF 2.20.0
Python        | 3.12.3     | 3.9–3.13
Clang         | 22.0.0     | 18.1.8
Bazel         | None       | 7.4.1
CUDA Toolkit  | 13.0       | 12.5
NVIDIA Driver | 581.42     | ≥ 535.54.03
cuDNN         | 9.14.0     | 9.3

Software Used

For my build I used the following CUDA, cuDNN, and Clang packages:

 cuda-repo-wsl-ubuntu-13-0-local_13.0.2-1_amd64.deb
 cudnn-local-repo-ubuntu2404-9.14.0_1.0-1_amd64.deb
 clang-22

.bashrc Changes

Having installed the above, I added the following to my .bashrc:

export PATH=/usr/local/cuda-13.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:$LD_LIBRARY_PATH

Environment Variables

Here are the ones I used:

export TF_CUDA_VERSION=13.0
export TF_CUDNN_VERSION=9.14
export TF_CUDA_COMPUTE_CAPABILITIES=12.0

export CUDA_TOOLKIT_PATH=/usr/local/cuda-13.0
export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu

export TF_ENABLE_ONEDNN_OPTS=1
export TF_NEED_CUDA=1
export TF_NEED_TENSORRT=0
export TF_NEED_ROCM=0
export TF_NEED_OPENCL_SYCL=0
export TF_NEED_MPI=0
export TF_NEED_NCCL=0
export TF_SET_ANDROID_WORKSPACE=0

export TF_CUDA_CLANG=1
export CLANG_CUDA_COMPILER_PATH=/usr/bin/clang
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc

export CC_OPT_FLAGS="-march=native -O3"
export TF_CONFIGURE_IOS=0

# Optional: use your local JSON compatibility matrix
export TF_CUDA_VERSION_JSON_URL="http://localhost/cuda_13.0.json"

export CLANG_CUDA_COMPILER_PATH=/usr/bin/clang-22

JSON File

I set up a JSON file served by Caddy containing:

{
  "cuda_version": "13.0",
  "cudnn_version": "9.14.0",
  "compute_capabilities": ["12.0"],
  "cuda_paths": {
    "cuda_home": "/usr/local/cuda-13.0",
    "cuda_bin": "/usr/local/cuda-13.0/bin",
    "cuda_lib": "/usr/local/cuda-13.0/lib64",
    "cudnn_lib": "/usr/lib/x86_64-linux-gnu",
    "cudnn_include": "/usr/include"
  },
  "env": {
    "PATH": "/usr/local/cuda-13.0/bin:$PATH",
    "LD_LIBRARY_PATH": "/usr/local/cuda-13.0/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH",
    "CUDA_HOME": "/usr/local/cuda-13.0",
    "TF_CUDA_VERSION": "13.0",
    "TF_CUDNN_VERSION": "9.14",
    "TF_CUDA_COMPUTE_CAPABILITIES": "12.0",
    "TF_ENABLE_ONEDNN_OPTS": "1"
  },
  "bazel_flags": [
    "--config=cuda",
    "--copt=-DGOOGLE_CUDA=1",
    "--copt=-DTF_CUDA_COMPUTE_CAPABILITIES=12.0",
    "--copt=-O3",
    "--copt=-march=native",
    "--define=using_cuda=true",
    "--define=using_cuda_nvcc=true"
  ],
  "notes": [
    "Ensure that cudnn_version.h confirms 9.14.0",
    "Verify that texture_fetch_functions.h is present in /usr/local/cuda-13.0/include/crt/",
    "If using Clang toolchain, ensure CUDA headers are explicitly included via --copt",
    "Consider switching to GCC if Clang fails to resolve CUDA device headers"
  ]
}

You can test with

curl http://localhost/cuda_13.0.json

In my case this no longer worked because systemctl locked up, so I cannot confirm whether it was actually used, but the contents are here for reference.

Configure

For ./configure I hit return on all of the questions except "Please specify a list of comma-separated CUDA compute capabilities you want to build with", which I answered with 12.0. The options are written to .tf_configure.bazelrc, so I provide mine here:

build --action_env PYTHON_BIN_PATH="/usr/bin/python3"
build --action_env PYTHON_LIB_PATH="/usr/lib/python3/dist-packages"
build --python_path="/usr/bin/python3"
build:cuda --repo_env HERMETIC_CUDA_COMPUTE_CAPABILITIES="12.0"
build --action_env LD_LIBRARY_PATH="/usr/local/cuda-13.0/lib64:"
build --config=cuda_clang
build --action_env CLANG_CUDA_COMPILER_PATH="/usr/lib/llvm-22/bin/clang"
build --config=cuda_clang
build:opt --copt=-march=native
build:opt --host_copt=-march=native
build:opt --copt=-O3
build:opt --host_copt=-O3
test --test_size_filters=small,medium
test --test_env=LD_LIBRARY_PATH
test:v1 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial
test:v1 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu
test:v2 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial,-v1only
test:v2 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-v1only

Building

Then I was able to run the following, which takes a long time (guessing 1-2 hours):

bazel build --config=cuda_wheel //tensorflow/tools/pip_package:wheel
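
After installing the wheel the build produces (the exact filename under bazel-bin will differ), a quick sanity check is to ask TensorFlow itself whether the GPU build works. A minimal sketch:

import tensorflow as tf

print(tf.__version__)                                      # should report 2.20.0
print(tf.config.list_physical_devices('GPU'))              # expect at least one GPU entry
print(tf.sysconfig.get_build_info().get('cuda_version'))   # CUDA version the wheel was built against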

Getting Started

More Maths

There is no end to maths. Each time I think I am fairly au fait with it, along comes more. This time it is the fourth root. We all, even I, know what a square root is:

Square root of 4 = 2 or 2^2 = 4
Square root of 9 = 3 or 3^2 = 9

Now let's look at the fourth root. What number, when raised to the power of 4, equals 100? It turns out to be approximately 3.16227766 (the square root of 10), i.e. 3.16227766 ^ 4 ≈ 100, so that number is known as the fourth root of 100. The reason I am interested in this is because we calculate embedding_dims using

min(8, max(2, floor(v ** 0.25)))

So when v = 100 we have

= min(8, max(2, floor(v^0.25)))
= min(8, max(2, floor(3.16)))
= min(8, max(2, 3))
= min(8, 3)
= 3

Digging deeper, the robot used the word heuristic, which did not mean much to me but appears to mean a rule of thumb that someone found works well enough in practice.
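
To make the heuristic concrete, here is a small sketch (the vocabulary sizes are made-up examples) of the embedding dimension it picks:

from math import floor

def embedding_dim(vocab_size: int) -> int:
    # fourth root of the vocabulary size, clamped to the range [2, 8]
    return min(8, max(2, floor(vocab_size ** 0.25)))

for v in (5, 100, 5000):
    print(v, '->', embedding_dim(v))   # 5 -> 2, 100 -> 3, 5000 -> 8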

Embeddings

So an embedding is just a way to turn categorical values into dense tensors that the model can learn:

# Step 1: Define your city vocabulary
cities = ['London', 'Glasgow', 'Dunedin']
city_input = tf.constant(['London', 'Glasgow', 'Dunedin', 'Unknown'])

# Step 2: Create a StringLookup layer
lookup = tf.keras.layers.StringLookup(vocabulary=cities, output_mode='int', mask_token=None)

# Step 3: Convert city names to integer indices
city_indices = lookup(city_input)
print("City indices:", city_indices.numpy())

# Step 4: Create an Embedding layer
# input_dim = vocab size + 1 for OOV
embedding_dim = 3
embedding = tf.keras.layers.Embedding(input_dim=len(cities) + 1, output_dim=embedding_dim)

# Step 5: Map indices to embedding vectors
city_vectors = embedding(city_indices)
print("Embedding vectors:\n", city_vectors.numpy())

This produces

City indices: [1 2 3 0]
Embedding vectors:
 [[-0.01283479  0.02066988  0.00210273]
 [-0.02549194  0.02811767  0.04098605]
 [ 0.01020243 -0.0374084   0.00962119]
 [-0.04021025 -0.01305205 -0.04124854]]

Some Terminology

  • Rank is the same as dim (the number of dimensions); see the snippet below
  • eval is the same as test
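
A quick check of the rank/dim equivalence (a minimal sketch):

import tensorflow as tf

t = tf.zeros([5, 5])
print(tf.rank(t).numpy())   # 2 -- what TensorFlow calls rank
print(t.shape.ndims)        # 2 -- the number of dimensions, i.e. dim/ndim elsewhere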

Tensors

Much like torch.rand/torch.zeros in PyTorch, we can make tensors:

# Make 5x5
t = tf.zeros([5,5])

# No different to pyTorch
print(f"Shape: {t.shape}")
print(f"Device: {t.device}")
print(f"Device: {t.dtype}")

# Reshape
t = tf.reshape(t, [1,25])
print(f"Reshape: {t}")

# Reshape
t = tf.reshape(t, [1,1,25])
print(f"Reshape: {t}")

Getting Data

In the TensorFlow course they got the data via CSV files:

import pandas as pd

# Load dataset.
dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv') # training data
dfeval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv') # testing data
y_train = dftrain.pop('survived')
y_eval = dfeval.pop('survived')

I quite liked these:

# Gives count,mean,std, min,max
dftrain.describe()
# Gives a histogram of age
dftrain.age.hist(bins=20)

Preparing the Data

It took a while to get TensorFlow to run (a week), but I finally got it working. The first example I used relied on pandas. It split the categorical data out from the numerical data. Here is a sample.

Creating Preprocessing Layer for Categories

I spent a lot of time on this, mainly to see how it related to the PyTorch approach and to understand a little bit about embeddings. The PyTorch approach was a great lesson in how important it is to understand the shape.

# Imports used by these snippets
import math
from math import floor
from typing import Any, Dict, List, cast

import numpy as np
import tensorflow as tf
from tensorflow import keras

# List the categorical and numeric columns
CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck', 'embark_town', 'alone']
NUMERIC_COLUMNS = ['age', 'fare']

ALL_COLUMNS = CATEGORICAL_COLUMNS + NUMERIC_COLUMNS
preprocessing_layers: Dict[str, keras.layers.Layer] = {}
# We'll store vocab sizes and embedding sizes for categorical features
vocab_sizes: Dict[str, int] = {}
embedding_dims: Dict[str, int] = {}

for feature_name in CATEGORICAL_COLUMNS:
    # treat categorical features as strings (fillna -> empty string)
    series = dftrain[feature_name].fillna('').astype(str)
    vocab = sorted(series.unique().tolist())
    # Use integer indices (output_mode='int') for efficient embedding lookup
    layer = keras.layers.StringLookup(vocabulary=vocab, output_mode='int', mask_token=None)
    preprocessing_layers[feature_name] = layer
    # reserve +1 for OOV / mask indices
    vocab_sizes[feature_name] = len(vocab) + 1
    # choose a small embedding dimension (tunable)
    v = vocab_sizes[feature_name]
    # ** means power, here we use the 4th root of vocab size
    embedding_dims[feature_name] = min(8, max(2, floor(v ** 0.25)))
    print(f'Categorical feature "{feature_name}" vocab_size={vocab_sizes[feature_name]}, embed_dim={embedding_dims[feature_name]}')

for feature_name in NUMERIC_COLUMNS:
    series = dftrain[feature_name].dropna()
    numeric_layer = keras.layers.Normalization()

    # Two equivalent ways to build the (N, 1) float32 array expected by adapt()
    a = np.asarray(series, dtype=np.float32).reshape(-1, 1)
    b = tf.expand_dims(series.astype('float32').values, axis=-1)
    numeric_layer.adapt(a)
    
    preprocessing_layers[feature_name] = numeric_layer
    print(f'Numeric feature "{feature_name}" Normalization layer created')


# Show the preprocessing layers mapping
for k, v in preprocessing_layers.items():
    print(k, '->', v)

# Show vocab sizes and embedding dimensions for categorical features
for feature_name in CATEGORICAL_COLUMNS:
    print(f'Categorical feature "{feature_name}" vocab_size={vocab_sizes[feature_name]}, embed_dim={embedding_dims[feature_name]}')
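
As a quick sanity check (a sketch that assumes dftrain and the layers created above), you can push a couple of raw values through the layers to see what the model will eventually receive:

# Categorical: strings -> integer indices (index 0 is reserved for out-of-vocabulary values)
print(preprocessing_layers['sex'](tf.constant([['male'], ['female'], ['unknown']])))

# Numeric: raw values -> standardised values (roughly zero mean, unit variance)
print(preprocessing_layers['age'](tf.constant([[22.0], [80.0]])))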

Creating Model Inputs

For the model we need to tell it the structure of the inputs and their dtypes. I am still not too proud of this code, but here it is.

def handle_string_lookup(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor :
    idx = layer(x)
    vocab_size = vocab_sizes[feature_name]
    emb_dim = embedding_dims[feature_name]
    emb = keras.layers.Embedding(input_dim=vocab_size, output_dim=emb_dim)(idx)
    return keras.layers.Reshape((emb_dim,))(emb)

def handle_integer_lookup(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    idx = layer(x)
    vocab_size = vocab_sizes.get(feature_name, None)
    emb_dim = embedding_dims.get(feature_name, 4)
    emb = keras.layers.Embedding(input_dim=vocab_size, output_dim=emb_dim)(idx)
    return keras.layers.Reshape((emb_dim,))(emb)

def handle_normalization(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    return layer(x)

def handle_default(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    return layer(x)

dispatch = {
    keras.layers.StringLookup: handle_string_lookup,
    keras.layers.IntegerLookup: handle_integer_lookup,
    keras.layers.Normalization: handle_normalization,
}


encoded_features: List[tf.Tensor] = []
inputs: dict[str, tf.Tensor] = {}

for feature_name in CATEGORICAL_COLUMNS + NUMERIC_COLUMNS:
    
    layer = preprocessing_layers[feature_name]

    # Determine the input dtype based on the preprocessing layer type
    if isinstance(layer, keras.layers.StringLookup):
        inp_dtype = tf.string
    elif isinstance(layer, keras.layers.IntegerLookup):
        inp_dtype = tf.int32
    elif isinstance(layer, keras.layers.Normalization):
        inp_dtype = tf.float32
    else:
        inp_dtype = tf.string

    # Create a Keras Input for this feature
    inp: tf.Tensor = cast(tf.Tensor, keras.Input(shape=(1,), name=feature_name, dtype=inp_dtype))
    x = inp

    handler = dispatch.get(type(layer), handle_default)
    encoded = handler(x, layer, feature_name)
    encoded_features.append(encoded)

    # Set the input for this feature name
    inputs[feature_name] = inp

Building the Model

This was the easy bit.

  • Flattening all encoded features into a single concatenated tensor
  • We are using the sigmoid activation, which in PyTorch we applied ourselves to convert the logits coming out of the model
  • The optimizer used is adam, but I am guessing we could equally have used SGD (torch.optim.SGD in PyTorch)
  • The loss function is binary cross-entropy, the same idea as nn.BCEWithLogitsLoss() (though here the sigmoid is applied in the layer rather than inside the loss)
  • Metrics are automated here, whereas in PyTorch we calculated them manually
  • The kernel_regularizer=keras.regularizers.l2(1e-4) is the equivalent of weight decay in Torch

# Concatenate processed features
all_features = keras.layers.concatenate(encoded_features)

# Linear model (logistic regression): a single unit with sigmoid activation
output = keras.layers.Dense(1, activation='sigmoid', kernel_regularizer=keras.regularizers.l2(1e-4))(all_features)
model = keras.Model(inputs=inputs, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

This produces the following output. We can see there are 7 InputLayers for the CATEGORICAL_COLUMNS and two for the NUMERIC_COLUMNS.

Layer (type) | Output Shape | Param # | Connected to
sex (InputLayer) | (None, 1) | 0 | -
n_siblings_spouses (InputLayer) | (None, 1) | 0 | -
parch (InputLayer) | (None, 1) | 0 | -
class (InputLayer) | (None, 1) | 0 | -
deck (InputLayer) | (None, 1) | 0 | -
embark_town (InputLayer) | (None, 1) | 0 | -
alone (InputLayer) | (None, 1) | 0 | -
string_lookup (StringLookup) | (None, 1) | 0 | sex[0][0]
string_lookup_1 (StringLookup) | (None, 1) | 0 | n_siblings_spouses[0][0]
string_lookup_2 (StringLookup) | (None, 1) | 0 | parch[0][0]
string_lookup_3 (StringLookup) | (None, 1) | 0 | class[0][0]
string_lookup_4 (StringLookup) | (None, 1) | 0 | deck[0][0]
string_lookup_5 (StringLookup) | (None, 1) | 0 | embark_town[0][0]
string_lookup_6 (StringLookup) | (None, 1) | 0 | alone[0][0]
embedding (Embedding) | (None, 1, 2) | 6 | string_lookup[0][0]
embedding_1 (Embedding) | (None, 1, 2) | 16 | string_lookup_1[0][0]
embedding_2 (Embedding) | (None, 1, 2) | 14 | string_lookup_2[0][0]
embedding_3 (Embedding) | (None, 1, 2) | 8 | string_lookup_3[0][0]
embedding_4 (Embedding) | (None, 1, 2) | 18 | string_lookup_4[0][0]
embedding_5 (Embedding) | (None, 1, 2) | 10 | string_lookup_5[0][0]
embedding_6 (Embedding) | (None, 1, 2) | 6 | string_lookup_6[0][0]
age (InputLayer) | (None, 1) | 0 | -
fare (InputLayer) | (None, 1) | 0 | -
reshape (Reshape) | (None, 2) | 0 | embedding[0][0]
reshape_1 (Reshape) | (None, 2) | 0 | embedding_1[0][0]
reshape_2 (Reshape) | (None, 2) | 0 | embedding_2[0][0]
reshape_3 (Reshape) | (None, 2) | 0 | embedding_3[0][0]
reshape_4 (Reshape) | (None, 2) | 0 | embedding_4[0][0]
reshape_5 (Reshape) | (None, 2) | 0 | embedding_5[0][0]
reshape_6 (Reshape) | (None, 2) | 0 | embedding_6[0][0]
normalization (Normalization) | (None, 1) | 3 | age[0][0]
normalization_1 (Normalization) | (None, 1) | 3 | fare[0][0]
concatenate (Concatenate) | (None, 16) | 0 | reshape[0][0], reshape_1[0][0], reshape_2[0][0], reshape_3[0][0], reshape_4[0][0], reshape_5[0][0], reshape_6[0][0], normalization[0][0], normalization_1[0][0]
dense (Dense) | (None, 1) | 17 | concatenate[0][0]

I had an issue with an invalid shape for the numerical columns. Looking at this table you can see that for the age column it says

normalization (Normalization) | (None, 1) | 3 | age[0][0]

This is saying

[TensorFlow Shape: None]
- None = flexible batch size
- (None, 1) = any number of samples, each with 1 feature
- (627, 1) fits perfectly → 627 samples, 1 feature each

I struggled with this a lot, but in the end (None, 1) means a 2-D array, not a 1-D one.
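
A small sketch (with made-up ages) of what that means in practice:

import numpy as np

ages = np.array([22.0, 38.0, 26.0], dtype=np.float32)
print(ages.shape)                  # (3,)   -- 1-D, does not match Input(shape=(1,))
print(ages.reshape(-1, 1).shape)   # (3, 1) -- any number of samples, 1 feature each: matches (None, 1)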

Convert Training/Evaluation Data from DataFrame to Dict

Easy peasy lemon squeezy. Nothing too hard here.

def df_to_dict(df) -> Dict[str, np.ndarray]:
    d: Dict[str, np.ndarray] = {}
    for name in CATEGORICAL_COLUMNS + NUMERIC_COLUMNS:
        vals = df[name].fillna('').values
        layer = preprocessing_layers[name]
        if isinstance(layer, keras.layers.Normalization):
            d[name] = vals.astype('float32').reshape(-1, 1)
        elif isinstance(layer, keras.layers.StringLookup):
            d[name] = vals.astype('str').reshape(-1, 1)
        elif isinstance(layer, keras.layers.IntegerLookup):
            d[name] = vals.astype('int32').reshape(-1, 1)
        else:
            d[name] = vals.astype(object).reshape(-1, 1)
    return d

train_dict = df_to_dict(dftrain)
eval_dict = df_to_dict(dfeval)

Run the Model

I spent a lot of time on the other parts, so I have not really looked at this much.

verbose1: Any = 1
verbose2: Any = 2


BATCH_SIZE = 32
EPOCHS = 20
SEED = 42  # set for reproducible shuffling

shuffle_buffer = len(dftrain)  # full shuffle (fits in memory for this dataset)

# Ensure labels are plain numpy arrays of a fixed numeric dtype to avoid unknown-shape issues.
y_train_np = y_train.astype('int32').values
y_eval_np = y_eval.astype('int32').values

# Create tf.data datasets from the dicts train_dict, eval_dict
train_ds = tf.data.Dataset.from_tensor_slices((train_dict, y_train_np))
# Shuffle because data is sometimes ordered, batch, and prefetch
train_ds = train_ds.shuffle(shuffle_buffer, seed=SEED).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

# We don't shuffle eval data
eval_ds = tf.data.Dataset.from_tensor_slices((eval_dict, y_eval_np)).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

steps_per_epoch = math.ceil(len(dftrain) / BATCH_SIZE)
eval_steps = math.ceil(len(dfeval) / BATCH_SIZE)

# Set global seeds for reproducibility
tf.random.set_seed(SEED)
np.random.seed(SEED)

# Callbacks: EarlyStopping + ModelCheckpoint (save best)
callbacks = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    # Use the native Keras format (.keras) instead of HDF5 to avoid legacy-format warnings
    keras.callbacks.ModelCheckpoint('best_model.keras', monitor='val_loss', save_best_only=True)
]

history = model.fit(
    train_ds,
    epochs=EPOCHS,
    steps_per_epoch=steps_per_epoch,
    validation_data=eval_ds,
    validation_steps=eval_steps,
    callbacks=callbacks,
    verbose=verbose1,
)

Comparing TensorFlow with PyTorch

In TensorFlow there are no hand-written training and testing loops. This is all done a little differently.

Feature | TensorFlow model.fit() | PyTorch Manual Loop
Training | Automatically trains over epochs | You write for epoch in range(...) manually
Evaluation | Can include validation_data for per-epoch eval | You manually run model.eval() and compute metrics
Batching | Uses batch_size internally | You use DataLoader to batch manually
Metrics | Tracks loss, accuracy, etc. automatically | You compute and log metrics manually
Callbacks | Supports early stopping, checkpoints, etc. | You implement these manually or use libraries
Progress | Built-in progress bar and logs | You print or log manually
Custom Logic | Harder to inject per-batch logic | Full control over every step
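
For reference, the right-hand column looks roughly like this in PyTorch. This is a minimal sketch with hypothetical EPOCHS, model, criterion, optimizer, train_loader and eval_loader objects:

import torch

for epoch in range(EPOCHS):
    model.train()
    for features, labels in train_loader:      # manual batching via DataLoader
        optimizer.zero_grad()
        logits = model(features)
        loss = criterion(logits, labels)       # e.g. nn.BCEWithLogitsLoss()
        loss.backward()
        optimizer.step()

    model.eval()                               # manual per-epoch evaluation
    correct = total = 0
    with torch.no_grad():
        for features, labels in eval_loader:
            preds = (torch.sigmoid(model(features)) > 0.5).float()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    print(f'epoch {epoch}: eval accuracy {correct / total:.3f}')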

Clustering

Clustering is a way to group similar data points into K clusters. K-means does it like this (see the sketch after the list):

  • Choose how many clusters (K)

Decide how many centroids you want — e.g., K = 3

  • Randomly place centroids

Pick K random points from your dataset as initial centroids

  • Assign each point to nearest centroid

Use Euclidean distance to group points by closest centroid

  • Update centroids

Recalculate each centroid as the mean of its assigned points (Center of Mass)

  • Repeat until stable

Keep reassigning and updating until the centroids stop moving

Euclidean distance is the straight-line distance between two points. There are other distance measures, e.g. Manhattan or Mahalanobis.
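
Here is a minimal NumPy sketch of those steps (the two-dimensional sample data is made up):

import numpy as np

def kmeans(points: np.ndarray, k: int, iters: int = 100, seed: int = 42):
    rng = np.random.default_rng(seed)
    # 1. pick K random points from the dataset as the initial centroids
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # 2. assign each point to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # 3. move each centroid to the mean (centre of mass) of its assigned points
        new_centroids = np.array([points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                                  for j in range(k)])
        # 4. stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(centre, 0.5, size=(50, 2)) for centre in ((0, 0), (5, 5), (0, 5))])
centroids, labels = kmeans(data, k=3)
print(centroids)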

Hidden Markov Model

For this model we need

  • States
  • Transition Distributions
  • Observation Distributions

To demonstrate this, the course shows an example where sunny and rainy are the states, the probability of moving from one to the other is the transition distribution, and the temperature observed while in each state is the observation distribution.

Creating a model is really easy. num_steps is the number of days to predict for.

import tensorflow_probability as tfp  # We are using a different module from tensorflow this time
import tensorflow as tf

tfd = tfp.distributions  # making a shortcut for later on
initial_distribution = tfd.Categorical(probs=[0.2, 0.8])  # probability of starting in each state
transition_distribution = tfd.Categorical(probs=[[0.5, 0.5],
                                                 [0.2, 0.8]])  # probability of moving between states
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])  # what we observe in each state

# the loc argument represents the mean and the scale is the standard deviation

model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=transition_distribution,
    observation_distribution=observation_distribution,
    num_steps=7)

Getting predictions out is done like this.

mean = model.mean()

# the TF1-era course evaluated this tensor from within a session; with TF 2.x eager
# execution mean.numpy() works directly, so the session wrapper below is only kept
# to match the course (tf.compat.v1.Session replaces the old tf.Session)
with tf.compat.v1.Session() as sess:
  print(mean.numpy())

Gives the following 7 temperatures.

[12.       11.1      10.83     10.748999 10.724699 10.71741  10.715222]

Convolutional Neural Networks (CNN)

This was remarkably (well, maybe not so remarkably) similar to PyTorch. A lot of the time was spent explaining how things like borders, padding and pooling work. I thoroughly recommend the site given on the PyTorch course, poloclub. I have checked examples into my Gitea repository. One thing that was also mentioned was Data Augmentation, which is of course making more data by rotating, flipping etc.

import matplotlib.pyplot as plt
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator

# creates a data generator object that transforms images
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# pick an image to transform
test_img = train_images[20]
img = image.img_to_array(test_img)  # convert image to numpy array
img = img.reshape((1,) + img.shape)  # add a batch dimension

i = 0

# this loop runs forever until we break; images are only written to disk if save_to_dir is also passed
for batch in datagen.flow(img, save_prefix='test', save_format='jpeg'):
    plt.figure(i)
    plot = plt.imshow(image.img_to_array(batch[0]))
    i += 1
    if i > 4:  # show 5 transformed images then stop
        break

plt.show()

Building a model is not that different from PyTorch either.

import tensorflow as tf
from tensorflow.keras import layers, models

# Create the CNN Base First
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add Dense Layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

# Compile and add Loss and Optimizer and Metrics
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# And Fit the data
history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))

Using Pretrained Model

This is probably the most interesting part of the course. Hopefully Kaggle will be compatible. Apparently the approach to getting the data differs between datasets, so you need to read the docs. I thought tensorflow_datasets was tied to TensorFlow and that having to build my own pipeline might be a problem, but it wasn't.

Picking a model is, I am guessing, the difficult bit. I was given the model MobileNet V2, which you can read about here. I think the main thing about it is that it uses inverted residual blocks, where the shortcut connections run between the narrow bottleneck layers and each block expands the channels before projecting back down (the opposite of a classic residual block).

I guess the most important part of the process when using a pretrained model is freezing it.

IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)

# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')

# Stop the original model being trained
base_model.trainable = False

# Show the model
base_model.summary()

The output of the summary shows

...
Trainable params: 0 (0.00 B)

Next we create our layers.

 
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = keras.layers.Dense(1)

The global_average_layer averages each feature map over its spatial dimensions:

(batch, H, W, channels) => (batch, channels)

Each output value is the average of the H x W values in that channel.
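
A quick shape check (a sketch; 1280 is the channel count MobileNetV2 happens to output, and the batch size of 8 is made up):

import tensorflow as tf

x = tf.random.normal([8, 5, 5, 1280])                  # (batch, H, W, channels)
pooled = tf.keras.layers.GlobalAveragePooling2D()(x)
print(pooled.shape)                                    # (8, 1280) -- one average per channel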

The Dense layer attaches one neuron to the output, as we are doing binary classification in this case. If it were multi-class classification it would be Dense(len(class_names)). Now we can build our model. Nothing to see here.

 
model = keras.Sequential([
  base_model,
  global_average_layer,
  prediction_layer
])


base_learning_rate = 0.0001
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=base_learning_rate), # type: ignore
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
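
From here training is the usual model.fit call. A minimal sketch, assuming the train_batches and validation_batches tf.data datasets from the course notebook already exist:

initial_epochs = 5

history = model.fit(train_batches,
                    epochs=initial_epochs,
                    validation_data=validation_batches)

print(history.history['accuracy'])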