22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


Attributes

The Dataset has many attributes, like features, num_columns, and shape:

dataset.features, dataset.num_columns, dataset.shape

Output

({'sentence': Value(dtype='string', id=None),
  'source': Value(dtype='string', id=None)},
 2,
 (3852, 2))

Our dataset has two columns, sentence and source, and there are 3,852 sentences in it.

It can be indexed like a list:

dataset[2]

Output

{'sentence': 'There was nothing so VERY remarkable in that; nor did Alice think it so VERY much out of the way to hear the Rabbit say to itself, `Oh dear!',
 'source': 'alice28-1476.txt'}

That’s the third sentence in our dataset (indexing starts at zero, so index 2 is the third element), and it is from Alice’s Adventures in Wonderland.

Its columns can also be accessed by name, as in a dictionary:

dataset['source'][:3]

Output

['alice28-1476.txt', 'alice28-1476.txt', 'alice28-1476.txt']
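To make the two access styles concrete, here is a minimal pure-Python sketch of a column-oriented table. The class TinyDataset, its sample rows, and the file names a.txt and b.txt are all hypothetical and made up for illustration; this is not how the Dataset class is actually implemented. It only mimics the behavior shown above: an integer index returns one row as a dictionary, and a column name returns that column's values as a list.

```python
class TinyDataset:
    """Hypothetical column-oriented table (illustration only,
    not the actual Dataset implementation)."""

    def __init__(self, columns):
        # Every column must hold the same number of rows
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self._columns = {name: list(values) for name, values in columns.items()}

    @property
    def num_columns(self):
        return len(self._columns)

    @property
    def shape(self):
        # (number of rows, number of columns)
        num_rows = len(next(iter(self._columns.values()), []))
        return (num_rows, self.num_columns)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style access: return the whole column
            return self._columns[key]
        if isinstance(key, int):
            # List-style access: gather one value per column into a row
            return {name: values[key] for name, values in self._columns.items()}
        raise TypeError("key must be a column name or an integer index")


# Made-up sample data, not taken from the book's dataset
toy = TinyDataset({
    "sentence": ["first sentence", "second sentence", "third sentence"],
    "source": ["a.txt", "a.txt", "b.txt"],
})

third_row = toy[2]                 # one row, as a dictionary
first_sources = toy["source"][:2]  # one column, then sliced like a list

print(toy.shape)
print(third_row)
print(first_sources)
```

Storing the data by column is what makes both styles cheap in this sketch: a column lookup is a single dictionary access, while a row is assembled on demand from one value per column.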

