1 0x_id
j edited this page 2023-07-02 13:08:57 +05:30

unique id for a movie

idea: hash title/director/year rather than create a random id

'0x%s' % hashlib.sha1('\t'.join(['0x', title, director, str(year)]).encode('utf-8')).hexdigest()

because this way, it is evident what the id references (the movie "Vertigo" by Alfred Hitchcock, released in 1958)

otherwise, the id would reference a record that is "Vertigo" at one point in time but may later become "The Matrix" (if we allow changes to the database, which we will)

in other words: derive the id from unique core metadata and change the id if that metadata changes, rather than keep a constant id for something that is itself vague
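a minimal sketch of that behaviour, assuming python 3 and utf-8 encoding (the function name movie_id is hypothetical, not part of any existing code): the same metadata always gives the same id, and correcting the metadata gives a new one.

```python
import hashlib

def movie_id(title, director, year):
    # same recipe as the one-liner above: '0x' prefix, tab-separated fields, sha1, hex
    key = '\t'.join(['0x', title, director, str(year)])
    return '0x%s' % hashlib.sha1(key.encode('utf-8')).hexdigest()

vertigo = movie_id('Vertigo', 'Alfred Hitchcock', 1958)

# deterministic: hashing the same metadata again yields the same id
same_again = movie_id('Vertigo', 'Alfred Hitchcock', 1958)

# a record entered with a wrong year gets a different id;
# fixing the year therefore changes the id instead of silently re-pointing it
wrong_year = movie_id('Vertigo', 'Alfred Hitchcock', 1959)
```

the id is always '0x' followed by 40 hex characters, so it can be recomputed and checked by anyone who knows the three fields.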

title, director and year would need to be defined precisely (title: original title; director: ', '.join(['Firstname Lastname', ...]); year: [beginYear-]releaseYear)
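one possible way to pin these definitions down, as a sketch only. assumptions: directors are kept in credit order and joined with ', ', and the year is written as releaseYear alone or as beginYear-releaseYear when a begin year is known. the helper names canonical_director and canonical_year are made up for illustration.

```python
def canonical_director(names):
    # names: list of 'Firstname Lastname' strings, kept in the given order
    return ', '.join(name.strip() for name in names)

def canonical_year(release_year, begin_year=None):
    # 'releaseYear' alone, or 'beginYear-releaseYear' if a distinct begin year is known
    if begin_year is None or begin_year == release_year:
        return str(release_year)
    return '%d-%d' % (begin_year, release_year)

director = canonical_director(['Alfred Hitchcock'])
year = canonical_year(1958)
```

whatever rules are chosen, they have to be applied before hashing, since any difference in spelling or format produces a different id.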

title may need an additional identifier to be unique (music: 'Rio [Album]' vs. 'Rio [Single]'), some Asian names are 'Lastname Firstname', unicode may be an issue, etc.
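the unicode part can probably be handled by normalizing strings before hashing. a sketch, assuming NFC normalization and whitespace collapsing are acceptable (normalize_text is a hypothetical helper; it does not address name order or the extra title identifier):

```python
import unicodedata

def normalize_text(s):
    # NFC composes 'e' + combining acute accent into the single character 'é'
    s = unicodedata.normalize('NFC', s)
    # collapse runs of whitespace so stray spaces do not change the hash
    return ' '.join(s.split())

composed = 'Am\u00e9lie'     # 'é' as one code point
decomposed = 'Ame\u0301lie'  # 'e' followed by a combining accent

# the two strings look identical but are different code point sequences,
# so without normalization they would hash to different ids
same_after_normalizing = normalize_text(composed) == normalize_text(decomposed)
```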

in the database, a movie would have a unique primary key that is different from the id. if the key is just incremental, then it should not be exposed (privacy, profiling).
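as a sketch of that separation, assuming sqlite and made-up table and column names: the incremental key stays internal, and all lookups from outside go through the derived id.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('''
    CREATE TABLE movie (
        pk INTEGER PRIMARY KEY,   -- internal incremental key, never exposed
        id TEXT NOT NULL UNIQUE,  -- public 0x id derived from the metadata
        title TEXT NOT NULL,
        director TEXT NOT NULL,
        year TEXT NOT NULL
    )
''')

# placeholder value that stands in for a real sha1-based id
public_id = '0x' + 'ab' * 20

insert_sql = 'INSERT INTO movie (id, title, director, year) VALUES (?, ?, ?, ?)'
conn.execute(insert_sql, (public_id, 'Vertigo', 'Alfred Hitchcock', '1958'))

# external lookups use the public id only
row = conn.execute('SELECT title FROM movie WHERE id = ?', (public_id,)).fetchone()
```

the UNIQUE constraint also means that two records with identical core metadata cannot coexist, which is the point where the extra title identifier would be needed.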

question: is this a good idea?