The LyricsGenerator class

class tflyrics.LyricsGenerator(artists: list = [], per_artist: int = 5, vocabulary: list = None, text_provider: tflyrics.text_provider.TextProvider = None)

An adapter between the Genius API and a TF Dataset.

A LyricsGenerator object queries the Genius API and provides a TensorFlow Dataset that can be fed to a model for training. Each sample of the dataset is a sequence of Unicode characters taken from song lyrics. More precisely, each example is a pair of character sequences of the same length, where the second is the first shifted by one character: for example, “Hello wo” maps to “ello wor”, and “ello wor” maps to “llo worl”.
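The shifted-pair structure described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name shifted_pairs is hypothetical, and the one-character stride between consecutive samples is inferred from the example in the text.

```python
def shifted_pairs(text, seq_length):
    """Return (input, target) pairs where each target is the input shifted by one character.

    Hypothetical helper for illustration; not part of tflyrics.
    """
    pairs = []
    # Stop early enough that the target window still fits inside the text.
    for start in range(len(text) - seq_length):
        source = text[start:start + seq_length]
        target = text[start + 1:start + seq_length + 1]
        pairs.append((source, target))
    return pairs


pairs = shifted_pairs("Hello world", 8)
print(pairs[0])  # ('Hello wo', 'ello wor')
print(pairs[1])  # ('ello wor', 'llo worl')
```

A model trained on such pairs learns to predict the next character at every position of the input sequence.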

as_dataset(batch_size: int = None, seq_length: int = 100) → tensorflow.python.data.ops.dataset_ops.DatasetV2

Get a TensorFlow dataset equivalent to this object.

Get a TensorFlow dataset whose samples are substrings of song lyrics provided by a Genius object. More specifically, each sample in the dataset is a pair of substrings that have the same length but are shifted by one character with respect to each other: e.g. [(“Hello”, “ello ”), (“ello ”, “llo W”), (“llo W”, “lo Wo”), (“lo Wo”, “o Wor”), (“o Wor”, “ Worl”), …].

Parameters
  • batch_size – batch size of the dataset

  • seq_length – length of each substring that forms a sample

Returns

a TensorFlow Dataset of substrings
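To make the role of batch_size concrete, here is a minimal pure-Python sketch of grouping samples into batches. It rests on two assumptions that the text above does not confirm: that batch_size=None leaves the samples unbatched, and that a final, smaller batch is kept rather than dropped. The function name batch_samples is hypothetical.

```python
def batch_samples(samples, batch_size=None):
    """Group samples into consecutive lists of at most batch_size items.

    Assumption: batch_size=None means "do not batch"; the real as_dataset
    may behave differently.
    """
    if batch_size is None:
        return list(samples)
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


samples = [
    ("Hello", "ello "),
    ("ello ", "llo W"),
    ("llo W", "lo Wo"),
    ("lo Wo", "o Wor"),
    ("o Wor", " Worl"),
]
batches = batch_samples(samples, batch_size=2)
print(len(batches))  # 3
print(batches[-1])   # [('o Wor', ' Worl')]
```

In an actual TensorFlow pipeline, the samples would be tensors of integer character IDs rather than Python strings.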

preprocess

Split the lyrics of a song into multiple substrings.

Extract substrings of a fixed, specified length from a string containing lyrics, and return them as sequences of integers rather than characters.

Parameters
  • text – a string containing lyrics

  • seq_length – fixed size of each output sequence

Returns

a list of substrings extracted from the song, as lists of ints
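The following sketch shows one way such a preprocessing step could work. It is not the library's code: the name preprocess_sketch, the optional vocabulary argument, the choice of non-overlapping chunks, and the decision to drop a trailing chunk shorter than seq_length are all assumptions made for illustration.

```python
def preprocess_sketch(text, seq_length, vocabulary=None):
    """Encode text as integers and split it into chunks of exactly seq_length.

    Hypothetical helper for illustration; not part of tflyrics.
    """
    # Assumption: without an explicit vocabulary, use the sorted set of characters.
    if vocabulary is None:
        vocabulary = sorted(set(text))
    char_to_id = {ch: idx for idx, ch in enumerate(vocabulary)}
    # Raises KeyError if the text contains a character outside the vocabulary.
    encoded = [char_to_id[ch] for ch in text]
    # Non-overlapping chunks; a shorter trailing chunk is dropped.
    last_start = len(encoded) - seq_length
    return [encoded[i:i + seq_length] for i in range(0, last_start + 1, seq_length)]


sequences = preprocess_sketch("la la la", 3)
print(sequences)  # [[2, 1, 0], [2, 1, 0]]
```

Here the vocabulary is [" ", "a", "l"], so "l" maps to 2, "a" to 1, and the space to 0; the final two characters do not fill a complete chunk and are discarded.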