The LyricsGenerator class
- class tflyrics.LyricsGenerator(artists: list = [], per_artist: int = 5, vocabulary: list = None, text_provider: tflyrics.text_provider.TextProvider = None)
An adapter between the Genius API and a TF Dataset.
A LyricsGenerator object queries the Genius API and provides a TensorFlow Dataset that can be fed to a model for training. Each sample of the dataset is a sequence of Unicode characters taken from song lyrics. More precisely, each example pairs a character sequence with a target sequence of the same length whose content is shifted by one character: for example, “Hello wo” maps to “ello wor”, and “ello wor” maps to “llo worl”.
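The shifted mapping described above can be illustrated with a minimal sketch in plain Python. The helper name `shifted_pair` is hypothetical and is not part of tflyrics; it only shows how one chunk of n+1 characters yields an input and a target of n characters each.

```python
from typing import Tuple


def shifted_pair(chunk: str) -> Tuple[str, str]:
    """Split a chunk of n+1 characters into an (input, target) pair of n characters each.

    Hypothetical helper for illustration; not a tflyrics function.
    """
    if len(chunk) < 2:
        raise ValueError("chunk must contain at least two characters")
    # The input drops the last character, the target drops the first one.
    return chunk[:-1], chunk[1:]


print(shifted_pair("Hello wor"))  # ('Hello wo', 'ello wor')
print(shifted_pair("ello worl"))  # ('ello wor', 'llo worl')
```

During training, a character-level model would typically receive the first string and be asked to predict the second, one position at a time.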
- as_dataset(batch_size: int = None, seq_length: int = 100) → tensorflow.python.data.ops.dataset_ops.DatasetV2
Get a TensorFlow dataset equivalent to this object.
Get a TensorFlow dataset whose samples are substrings of song lyrics provided by a Genius object. More specifically, each sample in the dataset is a pair of substrings that have the same length but are shifted by one character with respect to each other: e.g. [(“Hello”, “ello ”), (“ello ”, “llo W”), (“llo W”, “lo Wo”), (“lo Wo”, “o Wor”), (“o Wor”, “ Worl”), …].
- Parameters
batch_size – batch size of the dataset
seq_length – length of each substring that forms a sample
- Returns
a TensorFlow Dataset of substrings
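As a rough illustration of the sample layout described for as_dataset, the sketch below reproduces the documented pairs in plain Python, without TensorFlow or the Genius API. The names `window_pairs` and `batches` are hypothetical, and the batching behaviour (groups of `batch_size` pairs, with a shorter final group kept) is an assumption rather than the library's documented behaviour.

```python
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[str, str]


def window_pairs(text: str, seq_length: int) -> Iterator[Pair]:
    """Yield (input, target) substrings of seq_length characters, shifted by one.

    Hypothetical helper; it mirrors the documented example, not the tflyrics code.
    """
    for start in range(len(text) - seq_length):
        yield (text[start:start + seq_length],
               text[start + 1:start + seq_length + 1])


def batches(pairs: Iterable[Pair], batch_size: int) -> Iterator[List[Pair]]:
    """Group pairs into lists of batch_size items (assumption: keep the remainder)."""
    batch: List[Pair] = []
    for pair in pairs:
        batch.append(pair)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


pairs = list(window_pairs("Hello World", seq_length=5))
print(pairs[:2])  # [('Hello', 'ello '), ('ello ', 'llo W')]
print([len(b) for b in batches(pairs, batch_size=4)])  # [4, 2]
```

With the real class, the equivalent grouping would presumably be handled by the returned TensorFlow Dataset itself, so a sketch like this is only useful for reasoning about sample shapes.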
- preprocess
Split the lyrics of a song into multiple substrings.
Preprocess a string containing lyrics by extracting substrings of a fixed, specified size from it. The substrings are returned as sequences of integers rather than characters.
- Parameters
text – a string containing lyrics
seq_length – fixed size of each output sequence
- Returns
a list of substrings extracted from the song, as lists of ints
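The following is a minimal sketch of what such a preprocessing step could look like, not the library's actual implementation. The function name `encode_substrings`, the `step` parameter, and the choice to build a vocabulary from the text when none is given are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def encode_substrings(text: str, seq_length: int,
                      vocabulary: Optional[List[str]] = None,
                      step: int = 1) -> List[List[int]]:
    """Return fixed-size windows of text, encoded as lists of character indices.

    Hypothetical helper: the vocabulary fallback and `step` are assumptions.
    """
    if vocabulary is None:
        # Assumption: derive a sorted character vocabulary from the text itself.
        vocabulary = sorted(set(text))
    index: Dict[str, int] = {ch: i for i, ch in enumerate(vocabulary)}
    # Raises KeyError if the text contains a character outside the vocabulary.
    ids = [index[ch] for ch in text]
    # Slide a window of seq_length over the encoded text; partial windows are dropped.
    return [ids[start:start + seq_length]
            for start in range(0, len(ids) - seq_length + 1, step)]


print(encode_substrings("abcab", seq_length=3))
# [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
```

Encoding characters as integer indices is the usual way to feed text to an embedding layer, which is presumably why the method returns ints rather than characters.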