TensorFlow Course
Introduction
This is a consequence of doing the PyTorch course. I like to compare frameworks to a) reinforce concepts and b) have a choice of how to approach work. I chose freeCodeCamp this time, from here. Before the middle of the course I changed direction and instead created notebooks (ipynb) to compare TensorFlow with PyTorch. This reinforced my learning and can be found on my git site, tensorflow and pyTorch
Building from Source
My Software vs Required Software
I had a lot of trouble with CUDA and WSL. It was suggested I use 12.5.1, which I downloaded, but it still did not work. This is what I had and what was needed, according to the robot:
| Component | Your Version | Required for TF 2.20.0 |
|---|---|---|
| Python | 3.12.3 | 3.9–3.13 |
| Clang | 22.0.0 | 18.1.8 |
| Bazel | None | 7.4.1 |
| CUDA Toolkit | 13.0 | 12.5 |
| NVIDIA Driver | 581.42 | ≥ 535.54.03 |
| cuDNN | 9.14.0 | 9.3 |
Software Used
I used the following CUDA, cuDNN and Clang packages:
cuda-repo-wsl-ubuntu-13-0-local_13.0.2-1_amd64.deb
cudnn-local-repo-ubuntu2404-9.14.0_1.0-1_amd64.deb
clang-22
bashrc changes
Having installed the above, I made the following changes:
export PATH=/usr/local/cuda-13.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:$LD_LIBRARY_PATH
Environment Variables
Here are the ones I used
export TF_CUDA_VERSION=13.0
export TF_CUDNN_VERSION=9.14
export TF_CUDA_COMPUTE_CAPABILITIES=12.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda-13.0
export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
export TF_ENABLE_ONEDNN_OPTS=1
export TF_NEED_CUDA=1
export TF_NEED_TENSORRT=0
export TF_NEED_ROCM=0
export TF_NEED_OPENCL_SYCL=0
export TF_NEED_MPI=0
export TF_NEED_NCCL=0
export TF_SET_ANDROID_WORKSPACE=0
export TF_CUDA_CLANG=1
export CLANG_CUDA_COMPILER_PATH=/usr/bin/clang  # overridden by the clang-22 path further down
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export CC_OPT_FLAGS="-march=native -O3"
export TF_CONFIGURE_IOS=0
# Optional: use your local JSON compatibility matrix
export TF_CUDA_VERSION_JSON_URL="http://localhost/cuda_13.0.json"
export CLANG_CUDA_COMPILER_PATH=/usr/bin/clang-22
JSON File
I set up a JSON file served by Caddy containing:
{
  "cuda_version": "13.0",
  "cudnn_version": "9.14.0",
  "compute_capabilities": ["12.0"],
  "cuda_paths": {
    "cuda_home": "/usr/local/cuda-13.0",
    "cuda_bin": "/usr/local/cuda-13.0/bin",
    "cuda_lib": "/usr/local/cuda-13.0/lib64",
    "cudnn_lib": "/usr/lib/x86_64-linux-gnu",
    "cudnn_include": "/usr/include"
  },
  "env": {
    "PATH": "/usr/local/cuda-13.0/bin:$PATH",
    "LD_LIBRARY_PATH": "/usr/local/cuda-13.0/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH",
    "CUDA_HOME": "/usr/local/cuda-13.0",
    "TF_CUDA_VERSION": "13.0",
    "TF_CUDNN_VERSION": "9.14",
    "TF_CUDA_COMPUTE_CAPABILITIES": "12.0",
    "TF_ENABLE_ONEDNN_OPTS": "1"
  },
  "bazel_flags": [
    "--config=cuda",
    "--copt=-DGOOGLE_CUDA=1",
    "--copt=-DTF_CUDA_COMPUTE_CAPABILITIES=12.0",
    "--copt=-O3",
    "--copt=-march=native",
    "--define=using_cuda=true",
    "--define=using_cuda_nvcc=true"
  ],
  "notes": [
    "Ensure that cudnn_version.h confirms 9.14.0",
    "Verify that texture_fetch_functions.h is present in /usr/local/cuda-13.0/include/crt/",
    "If using Clang toolchain, ensure CUDA headers are explicitly included via --copt",
    "Consider switching to GCC if Clang fails to resolve CUDA device headers"
  ]
}
You can test with
curl http://localhost/cuda_13.0.json
In my case it no longer worked because systemctl locked up, so I cannot confirm whether this file was actually used, but the contents are here for reference.
Configure
For ./configure I hit return on all of the questions except "Please specify a list of comma-separated CUDA compute capabilities you want to build with", which I answered with 12.0. The options are written to .tf_configure.bazelrc, so I provide mine here:
build --action_env PYTHON_BIN_PATH="/usr/bin/python3"
build --action_env PYTHON_LIB_PATH="/usr/lib/python3/dist-packages"
build --python_path="/usr/bin/python3"
build:cuda --repo_env HERMETIC_CUDA_COMPUTE_CAPABILITIES="12.0"
build --action_env LD_LIBRARY_PATH="/usr/local/cuda-13.0/lib64:"
build --config=cuda_clang
build --action_env CLANG_CUDA_COMPILER_PATH="/usr/lib/llvm-22/bin/clang"
build --config=cuda_clang
build:opt --copt=-march=native
build:opt --host_copt=-march=native
build:opt --copt=-O3
build:opt --host_copt=-O3
test --test_size_filters=small,medium
test --test_env=LD_LIBRARY_PATH
test:v1 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial
test:v1 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu
test:v2 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial,-v1only
test:v2 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-v1only
Building
Then I was able to run the following, which takes a long time (1-2 hours at a guess):
bazel build --config=cuda_wheel //tensorflow/tools/pip_package:wheel
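Once the wheel was installed, a quick sanity check (my own addition, not from the course) is to ask TensorFlow what it was built against and whether it can see the GPU:
import tensorflow as tf
# Show the version, the visible GPUs, and the CUDA version the wheel was built with
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))
print(tf.sysconfig.get_build_info().get('cuda_version'))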
Getting Started
More Maths
There is no end to maths. Each time I think I am fairly au fait with it, along comes more; this time, the fourth root. We all, even I, know what a square root is:
Square root of 4 = 2, i.e. 2^2 = 4
Square root of 9 = 3, i.e. 3^2 = 9
Now let's look at the fourth root: what number, when raised to the power of 4, equals 100? It turns out to be 3.1622776601683793319988935444327, i.e. 3.1622776601683793319988935444327 ^ 4 = 100, and this number is therefore known as the fourth root of 100. The reason I am interested in this is because we calculate embedding_dims using
min(8, max(2, floor(v ** 0.25)))
So when v = 100 we have:
min(8, max(2, floor(100 ** 0.25))) = min(8, max(2, floor(3.16))) = min(8, max(2, 3)) = min(8, 3) = 3
Digging deeper, the robot used the word "heuristic", which did not mean much to me but appears to mean a rule of thumb that someone found works in practice.
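To convince myself the heuristic behaves sensibly, here is a quick sketch (my own) of the embedding size it picks for a few vocabulary sizes:
from math import floor

def embed_dim(v: int) -> int:
    # fourth root of the vocab size, clamped to the range [2, 8]
    return min(8, max(2, floor(v ** 0.25)))

for v in [4, 10, 100, 5000, 100000]:
    print(v, '->', embed_dim(v))
# 4 -> 2, 10 -> 2, 100 -> 3, 5000 -> 8, 100000 -> 8
Small vocabularies get small embeddings, and the dimension is capped at 8 no matter how large the vocabulary grows.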
Embeddings
So this is just a way to turn categorical data into dense numeric tensors:
# Step 1: Define your city vocabulary
cities = ['London', 'Glasgow', 'Dunedin']
city_input = tf.constant(['London', 'Glasgow', 'Dunedin', 'Unknown'])
# Step 2: Create a StringLookup layer
lookup = tf.keras.layers.StringLookup(vocabulary=cities, output_mode='int', mask_token=None)
# Step 3: Convert city names to integer indices
city_indices = lookup(city_input)
print("City indices:", city_indices.numpy())
# Step 4: Create an Embedding layer
# input_dim = vocab size + 1 for OOV
embedding_dim = 3
embedding = tf.keras.layers.Embedding(input_dim=len(cities) + 1, output_dim=embedding_dim)
# Step 5: Map indices to embedding vectors
city_vectors = embedding(city_indices)
print("Embedding vectors:\n", city_vectors.numpy())
This produces
City indices: [1 2 3 0]
Embedding vectors:
[[-0.01283479  0.02066988  0.00210273]
 [-0.02549194  0.02811767  0.04098605]
 [ 0.01020243 -0.0374084   0.00962119]
 [-0.04021025 -0.01305205 -0.04124854]]
Note that 'Unknown' is not in the vocabulary, so it maps to index 0, the OOV bucket.
Some Terminology
- Rank is the same as dim
- eval is the same as test
Tensors
Much like rand and zeros in PyTorch, we can make tensors:
# Make 5x5
t = tf.zeros([5,5])
# No different to pyTorch
print(f"Shape: {t.shape}")
print(f"Device: {t.device}")
print(f"Device: {t.dtype}")
# Reshape
t = tf.reshape(t, [1,25])
print(f"Reshape: {t}")
# Reshape
t = tf.reshape(t, [1,1,25])
print(f"Reshape: {t}")
Getting Data
In the TensorFlow course they got the data via CSV files:
# Load dataset.
dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv') # training data
dfeval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv') # testing data
y_train = dftrain.pop('survived')
y_eval = dfeval.pop('survived')
I quite liked:
# Gives count,mean,std, min,max
dftrain.describe()
# Gives a histogram of age
dftrain.age.hist(bins=20)
Preparing the Data
It took a while to get TensorFlow to run (a week), but I finally got it working. The first example I used relied on pandas and split out the categorical data from the numerical data.
Creating Preprocessing Layer for Categories
I spent a lot of time on this, mainly to see how it related to the PyTorch approach and to understand a little about embeddings. The PyTorch approach was a great lesson in how important it is to understand the shapes.
# List the categorical and numeric columns
CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck', 'embark_town', 'alone']
NUMERIC_COLUMNS = ['age', 'fare']
ALL_COLUMNS = CATEGORICAL_COLUMNS + NUMERIC_COLUMNS
preprocessing_layers: Dict[str, keras.layers.Layer] = {}
# We'll store vocab sizes and embedding sizes for categorical features
vocab_sizes: Dict[str, int] = {}
embedding_dims: Dict[str, int] = {}
for feature_name in CATEGORICAL_COLUMNS:
    # treat categorical features as strings (fillna -> empty string)
    series = dftrain[feature_name].fillna('').astype(str)
    vocab = sorted(series.unique().tolist())
    # Use integer indices (output_mode='int') for efficient embedding lookup
    layer = keras.layers.StringLookup(vocabulary=vocab, output_mode='int', mask_token=None)
    preprocessing_layers[feature_name] = layer
    # reserve +1 for OOV / mask indices
    vocab_sizes[feature_name] = len(vocab) + 1
    # choose a small embedding dimension (tunable)
    v = vocab_sizes[feature_name]
    # ** means power; here we take the 4th root of the vocab size
    embedding_dims[feature_name] = min(8, max(2, floor(v ** 0.25)))
    print(f'Categorical feature "{feature_name}" vocab_size={vocab_sizes[feature_name]}, embed_dim={embedding_dims[feature_name]}')
for feature_name in NUMERIC_COLUMNS:
    series = dftrain[feature_name].dropna()
    numeric_layer = keras.layers.Normalization()
    # Two equivalent ways to get an (n, 1) float32 array for adapt()
    a = np.asarray(series, dtype=np.float32).reshape(-1, 1)
    b = tf.expand_dims(series.astype('float32').values, axis=-1)
    numeric_layer.adapt(a)
    preprocessing_layers[feature_name] = numeric_layer
    print(f'Numeric feature "{feature_name}" Normalization layer created')
# Show the preprocessing layers mapping
for k, v in preprocessing_layers.items():
    print(k, '->', v)
# Show vocab sizes and embedding dimensions for categorical features
for feature_name in CATEGORICAL_COLUMNS:
    print(f'Categorical feature "{feature_name}" vocab_size={vocab_sizes[feature_name]}, embed_dim={embedding_dims[feature_name]}')
Creating Model Inputs
We need to tell the model the structure of the inputs and their data types. I am still not too proud of this code, but here it is.
def handle_string_lookup(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    idx = layer(x)
    vocab_size = vocab_sizes[feature_name]
    emb_dim = embedding_dims[feature_name]
    emb = keras.layers.Embedding(input_dim=vocab_size, output_dim=emb_dim)(idx)
    return keras.layers.Reshape((emb_dim,))(emb)

def handle_integer_lookup(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    idx = layer(x)
    vocab_size = vocab_sizes.get(feature_name, None)
    emb_dim = embedding_dims.get(feature_name, 4)
    emb = keras.layers.Embedding(input_dim=vocab_size, output_dim=emb_dim)(idx)
    return keras.layers.Reshape((emb_dim,))(emb)

def handle_normalization(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    return layer(x)

def handle_default(x: tf.Tensor, layer: keras.Layer, feature_name: str) -> tf.Tensor:
    return layer(x)

dispatch = {
    keras.layers.StringLookup: handle_string_lookup,
    keras.layers.IntegerLookup: handle_integer_lookup,
    keras.layers.Normalization: handle_normalization,
}
encoded_features: List[tf.Tensor] = []
inputs: Dict[str, tf.Tensor] = {}
for feature_name in CATEGORICAL_COLUMNS + NUMERIC_COLUMNS:
    layer = preprocessing_layers[feature_name]
    # Determine the input dtype based on the preprocessing layer type
    if isinstance(layer, keras.layers.StringLookup):
        inp_dtype = tf.string
    elif isinstance(layer, keras.layers.IntegerLookup):
        inp_dtype = tf.int32
    elif isinstance(layer, keras.layers.Normalization):
        inp_dtype = tf.float32
    else:
        inp_dtype = tf.string
    # Create a Keras Input for this feature
    inp: tf.Tensor = cast(tf.Tensor, keras.Input(shape=(1,), name=feature_name, dtype=inp_dtype))
    x = inp
    handler = dispatch.get(type(layer), handle_default)
    encoded = handler(x, layer, feature_name)
    encoded_features.append(encoded)
    # Keep the Input for this feature name
    inputs[feature_name] = inp
Building the Model
This was the easy bit.
- We flatten all encoded features into a single concatenated tensor
- We are using a sigmoid activation; in PyTorch you would apply a sigmoid to convert the model's logits to probabilities
- The optimizer used is Adam, but I am guessing we could equally have used SGD (torch.optim.SGD in PyTorch)
- For the loss we use binary cross-entropy; since the output already goes through a sigmoid, the closest PyTorch equivalent is nn.BCELoss (nn.BCEWithLogitsLoss would be the match if we kept raw logits)
- Metrics are automated here, whereas in PyTorch we calculated them manually
- The kernel_regularizer=keras.regularizers.l2(1e-4) is the equivalent of weight decay in Torch
# Concatenate processed features
all_features = keras.layers.concatenate(encoded_features)
# Linear model (logistic regression): a single unit with sigmoid activation
output = keras.layers.Dense(1, activation='sigmoid', kernel_regularizer=keras.regularizers.l2(1e-4))(all_features)
model = keras.Model(inputs=inputs, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
This produces the following output. We can see there are 7 InputLayers for the CATEGORICAL_COLUMNS and two for the NUMERIC_COLUMNS.
| Layer (type) | Output Shape | Param # | Connected to |
|---|---|---|---|
| sex (InputLayer) | (None, 1) | 0 | - |
| n_siblings_spouses (InputLayer) | (None, 1) | 0 | - |
| parch (InputLayer) | (None, 1) | 0 | - |
| class (InputLayer) | (None, 1) | 0 | - |
| deck (InputLayer) | (None, 1) | 0 | - |
| embark_town (InputLayer) | (None, 1) | 0 | - |
| alone (InputLayer) | (None, 1) | 0 | - |
| string_lookup (StringLookup) | (None, 1) | 0 | sex[0][0] |
| string_lookup_1 (StringLookup) | (None, 1) | 0 | n_siblings_spouses[0][0] |
| string_lookup_2 (StringLookup) | (None, 1) | 0 | parch[0][0] |
| string_lookup_3 (StringLookup) | (None, 1) | 0 | class[0][0] |
| string_lookup_4 (StringLookup) | (None, 1) | 0 | deck[0][0] |
| string_lookup_5 (StringLookup) | (None, 1) | 0 | embark_town[0][0] |
| string_lookup_6 (StringLookup) | (None, 1) | 0 | alone[0][0] |
| embedding (Embedding) | (None, 1, 2) | 6 | string_lookup[0][0] |
| embedding_1 (Embedding) | (None, 1, 2) | 16 | string_lookup_1[0][0] |
| embedding_2 (Embedding) | (None, 1, 2) | 14 | string_lookup_2[0][0] |
| embedding_3 (Embedding) | (None, 1, 2) | 8 | string_lookup_3[0][0] |
| embedding_4 (Embedding) | (None, 1, 2) | 18 | string_lookup_4[0][0] |
| embedding_5 (Embedding) | (None, 1, 2) | 10 | string_lookup_5[0][0] |
| embedding_6 (Embedding) | (None, 1, 2) | 6 | string_lookup_6[0][0] |
| age (InputLayer) | (None, 1) | 0 | - |
| fare (InputLayer) | (None, 1) | 0 | - |
| reshape (Reshape) | (None, 2) | 0 | embedding[0][0] |
| reshape_1 (Reshape) | (None, 2) | 0 | embedding_1[0][0] |
| reshape_2 (Reshape) | (None, 2) | 0 | embedding_2[0][0] |
| reshape_3 (Reshape) | (None, 2) | 0 | embedding_3[0][0] |
| reshape_4 (Reshape) | (None, 2) | 0 | embedding_4[0][0] |
| reshape_5 (Reshape) | (None, 2) | 0 | embedding_5[0][0] |
| reshape_6 (Reshape) | (None, 2) | 0 | embedding_6[0][0] |
| normalization (Normalization) | (None, 1) | 3 | age[0][0] |
| normalization_1 (Normalization) | (None, 1) | 3 | fare[0][0] |
| concatenate (Concatenate) | (None, 16) | 0 | reshape[0][0], reshape_1[0][0], reshape_2[0][0], reshape_3[0][0], reshape_4[0][0], reshape_5[0][0], reshape_6[0][0], normalization[0][0], normalization_1[0][0] |
| dense (Dense) | (None, 1) | 17 | concatenate[0][0] |
I had an issue with an invalid shape for the numerical columns. Looking at this table, you can see that for the age column it says
| normalization (Normalization) | (None, 1) | 3 | age[0][0] |
This is saying:
- None = flexible batch size
- (None, 1) = any number of samples, each with 1 feature
- (627, 1) fits perfectly: 627 samples, 1 feature each
I struggled a lot with this, but in the end (None, 1) means a 2D array, not a 1D one.
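Here is the shape issue in miniature (my own example): a pandas column comes out as a 1D array, and reshape(-1, 1) turns it into the 2D (rows, 1) array the model expects:
import numpy as np

ages = np.array([22.0, 38.0, 26.0], dtype=np.float32)
print(ages.shape)        # (3,)  - 1D, which causes the invalid shape error
ages_2d = ages.reshape(-1, 1)
print(ages_2d.shape)     # (3, 1) - 2D, matches the (None, 1) input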
Convert Training/Evaluation Data from DataFrame to Dict
Easy peasy lemon squeezy. Nothing too hard here.
def df_to_dict(df) -> Dict[str, np.ndarray]:
    d: Dict[str, np.ndarray] = {}
    for name in CATEGORICAL_COLUMNS + NUMERIC_COLUMNS:
        vals = df[name].fillna('').values
        layer = preprocessing_layers[name]
        if isinstance(layer, keras.layers.Normalization):
            # note: fillna('') is only safe here because the numeric columns
            # in this dataset have no missing values
            d[name] = vals.astype('float32').reshape(-1, 1)
        elif isinstance(layer, keras.layers.StringLookup):
            d[name] = vals.astype('str').reshape(-1, 1)
        elif isinstance(layer, keras.layers.IntegerLookup):
            d[name] = vals.astype('int32').reshape(-1, 1)
        else:
            d[name] = vals.astype(object).reshape(-1, 1)
    return d
train_dict = df_to_dict(dftrain)
eval_dict = df_to_dict(dfeval)
Run the Model
I spent a lot of time on the earlier parts, so I have not really looked at this much.
verbose1: Any = 1
verbose2: Any = 2
BATCH_SIZE = 32
EPOCHS = 20
SEED = 42 # set for reproducible shuffling
shuffle_buffer = len(dftrain) # full shuffle (fits in memory for this dataset)
# Ensure labels are plain numpy arrays of a fixed numeric dtype to avoid unknown-shape issues.
y_train_np = y_train.astype('int32').values
y_eval_np = y_eval.astype('int32').values
# Create tf.data datasets from the dicts train_dict, eval_dict
train_ds = tf.data.Dataset.from_tensor_slices((train_dict, y_train_np))
# Shuffle because data is sometimes ordered, batch, and prefetch
train_ds = train_ds.shuffle(shuffle_buffer, seed=SEED).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
# We don't shuffle eval data
eval_ds = tf.data.Dataset.from_tensor_slices((eval_dict, y_eval_np)).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
steps_per_epoch = math.ceil(len(dftrain) / BATCH_SIZE)
eval_steps = math.ceil(len(dfeval) / BATCH_SIZE)
# Set global seeds for reproducibility
tf.random.set_seed(SEED)
np.random.seed(SEED)
# Callbacks: EarlyStopping + ModelCheckpoint (save best)
callbacks = [
keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
# Use the native Keras format (.keras) instead of HDF5 to avoid legacy-format warnings
keras.callbacks.ModelCheckpoint('best_model.keras', monitor='val_loss', save_best_only=True)
]
history = model.fit(
train_ds,
epochs=EPOCHS,
steps_per_epoch=steps_per_epoch,
validation_data=eval_ds,
validation_steps=eval_steps,
callbacks=callbacks,
verbose=verbose1,
)
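Having trained, a natural follow-up (my own sketch, not from the course) is to score the restored best weights on the eval set and peek at a few predicted survival probabilities:
# Evaluate on the eval dataset; returns the loss and the compiled metrics
loss, accuracy = model.evaluate(eval_ds, verbose=verbose2)
print(f'eval loss={loss:.4f} accuracy={accuracy:.4f}')

# The sigmoid output is a survival probability per passenger
probs = model.predict(eval_ds)
print(probs[:5].ravel())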
Comparing Tensorflow with PyTorch
In TensorFlow there are no explicit training and testing loops; this is all done a little differently.
| Feature | TensorFlow model.fit() | PyTorch Manual Loop |
|---|---|---|
| Training | Automatically trains over epochs | You write for epoch in range(...) manually |
| Evaluation | Can include validation_data for per-epoch eval | You manually run model.eval() and compute metrics |
| Batching | Uses batch_size internally | You use DataLoader to batch manually |
| Metrics | Tracks loss, accuracy, etc. automatically | You compute and log metrics manually |
| Callbacks | Supports early stopping, checkpoints, etc. | You implement these manually or use libraries |
| Progress | Built-in progress bar and logs | You print or log manually |
| Custom Logic | Harder to inject per-batch logic | Full control over every step |
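For comparison, here is a minimal sketch of what the right-hand column looks like in practice. This is my own illustration on synthetic data, not the course's code; the shapes mirror the 16-feature concatenated tensor above:
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the Titanic features: 627 samples, 16 features
X = torch.randn(627, 16)
y = (X[:, 0] > 0).float().unsqueeze(1)  # a fake binary target
train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # manual batching

model = nn.Linear(16, 1)                # logistic regression, like the Keras model
criterion = nn.BCEWithLogitsLoss()      # sigmoid + binary cross-entropy in one
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):                  # you write the epoch loop yourself
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
    model.eval()                        # evaluation and metrics are manual too
    with torch.no_grad():
        acc = ((model(X) > 0).float() == y).float().mean().item()
    print(f'epoch {epoch}: loss {loss.item():.4f} acc {acc:.3f}')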
Clustering
Clustering is a way to put data points into K groups. K-means does this by the following steps (a minimal sketch follows below):
- Choose how many clusters (K)
Decide how many centroids you want — e.g., K = 3
- Randomly place centroids
Pick K random points from your dataset as initial centroids
- Assign each point to nearest centroid
Use Euclidean distance to group points by closest centroid
- Update centroids
Recalculate each centroid as the mean of its assigned points (Center of Mass)
- Repeat until stable
Keep reassigning and updating until centroids stop moving
Euclidean distance is the straight-line distance between two points. There are others, e.g. Manhattan or Mahalanobis.

Hidden Markov Model
For this model we need
- States
- Transition Distributions
- Observation Distributions
To demonstrate this, the course showed an example where Sunshine and Rain were the states, the probability of going from one to the other was the transition distribution, and the data collected while in each state was the observation distribution.
Creating a model is really easy. num_steps is the number of days to predict for.
import tensorflow_probability as tfp  # we are using a different module from tensorflow this time
import tensorflow as tf
tfd = tfp.distributions  # making a shortcut for later on
initial_distribution = tfd.Categorical(probs=[0.2, 0.8])  # the probability of starting in each state
transition_distribution = tfd.Categorical(probs=[[0.5, 0.5],
                                                 [0.2, 0.8]])  # refer to point 2 above
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])  # refer to point 3 above
# the loc argument represents the mean and the scale is the standard deviation
model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=transition_distribution,
    observation_distribution=observation_distribution,
    num_steps=7)
Getting predictions out is done like this:
mean = model.mean()
# due to the way TensorFlow 1.x worked at a lower level, we had to evaluate part
# of the graph from within a session to see the value of this tensor;
# in new versions of TensorFlow we use tf.compat.v1.Session() rather than tf.Session()
with tf.compat.v1.Session() as sess:
    print(mean.numpy())
Gives the following 7 temperatures.
[12. 11.1 10.83 10.748999 10.724699 10.71741 10.715222]
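We can hand-check these numbers (my own verification): the expected temperature each day is the state distribution times the state means, stepping the distribution through the transition matrix between days:
import numpy as np

p = np.array([0.2, 0.8])       # initial P(cold), P(hot)
T = np.array([[0.5, 0.5],
              [0.2, 0.8]])     # transition matrix
means = np.array([0.0, 15.0])  # mean temperature of each state

for day in range(7):
    print(day + 1, p @ means)  # day 1: 0.2*0 + 0.8*15 = 12.0
    p = p @ T                  # day 2: p becomes [0.26, 0.74] -> 11.1, and so on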
Convolutional Neural Networks (CNN)
This was remarkably (well, maybe not) similar to PyTorch. A lot of the time was spent explaining how things like borders, padding and pooling work. I thoroughly recommend the site given on the PyTorch course, poloclub. I have checked the examples into my Gitea repository. One thing that was also mentioned was data augmentation, which is of course making more data by rotating, flipping, etc.
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator

# creates a data generator object that transforms images
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# pick an image to transform
test_img = train_images[20]
img = image.img_to_array(test_img)    # convert image to numpy array
img = img.reshape((1,) + img.shape)   # reshape image

i = 0
# this loop runs forever until we break, saving images to the current directory with the given prefix
for batch in datagen.flow(img, save_prefix='test', save_format='jpeg'):
    plt.figure(i)
    plot = plt.imshow(image.img_to_array(batch[0]))
    i += 1
    if i > 4:  # show 5 augmented images, then stop
        break
plt.show()
Building a model is not that different from PyTorch either.
# Create the CNN Base First
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Add Dense Layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
# Compile and add Loss and Optimizer and Metrics
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
# And Fit the data
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
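To see whether the model is overfitting, I like to plot the curves that model.fit() collects in history (my own addition, not from the course):
import matplotlib.pyplot as plt

# history.history holds per-epoch values for the compiled metrics
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()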
Using Pretrained Model
This is probably the most interesting part of the course. Hopefully Kaggle will be compatible. Apparently the approach to getting the data differs between datasets, so you need to read the docs. I thought tensorflow_datasets was tied to TensorFlow, and that having to build my own might be a problem, but it wasn't.
Picking a model is, I am guessing, the difficult bit. I was given the model MobileNet V2, which you can read about here. I think the main thing about it is its use of inverted residual blocks: rather than adding the input of a wide convolutional layer to its output, the block expands the channels, applies a depthwise convolution, projects back down, and the skip connection joins the narrow bottleneck layers.
I guess the most important part of the process when using a pretrained model is freezing it.
IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)
# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
include_top=False,
weights='imagenet')
# Stop the original model being trained
base_model.trainable = False
# Show the model
base_model.summary()
The output of the summary shows
... Trainable params: 0 (0.00 B)
Next we create our layers.
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = keras.layers.Dense(1)
The global_average_layer averages over the spatial dimensions:
(batch, H, W, C) => (batch, C)
where each of the C outputs is the average over the H x W positions of that channel.
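A quick shape check (my own) makes this concrete; 5x5x1280 should be what MobileNetV2 produces for a 160x160 input with include_top=False:
import tensorflow as tf

x = tf.random.normal([2, 5, 5, 1280])   # (batch, H, W, C) feature maps
gap = tf.keras.layers.GlobalAveragePooling2D()
print(gap(x).shape)                     # (2, 1280) - one average per channel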
The Dense layer attaches one neuron to the output, as we are doing binary classification in this case. If it were multi-class classification it would be Dense(len(class_names)). Now we can build our model. Nothing to see here:
model = keras.Sequential([
base_model,
global_average_layer,
prediction_layer
])
base_learning_rate = 0.0001
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=base_learning_rate), # type: ignore
loss=keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
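From here, training only updates the new head, since the base is frozen. A sketch of the fit call, assuming train_batches and validation_batches tf.data pipelines exist as in the course's dataset setup (hypothetical names here):
# Only the new Dense head has trainable parameters at this point
history = model.fit(train_batches,
                    epochs=3,
                    validation_data=validation_batches)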