Batman Hacked My Password: A Subtitle-Based Analysis of Password Depiction in Movies
Link: https://www.usenix.org/conference/soups2024/presentation/raphael
Conference: SOUPS 2024
Keywords: Usable Security, Passwords, Movies, Subtitles
Supplement: https://www.itsec.uni-hannover.de/de/usec/forschung/medien/password-depiction-in-movies#c89784
Summary
People learn from movies, which shape their perceptions of cybersecurity, including their mindset about passwords.
Data & Methods
The data comes from a torrent shared on Reddit's r/DataHoarder. The torrent contains a database (opensubs.db, 136.8 GB) of 5,719,123 subtitle files, crawled on July 24, 2022. It also contains a metadata file (subtitles_all.txt.gz, 309 MB) that includes information such as movie name, year, language, content type (movie, TV show), season, episode, IDs (IMDB, OpenSubtitle), upload date, frame rate, and file format.
Filtering out non-English and non-movie subtitles leaves the subtitles and metadata of 97,709 movies.
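The filtering step could be sketched roughly as below. This is an illustration only, not the authors' pipeline: it assumes the metadata is tab-separated, and the column names (name, year, language, kind) and sample rows are hypothetical, since these notes do not record the actual layout of subtitles_all.txt.gz.

```python
import csv
import io

# Hypothetical tab-separated metadata. The column names and rows are
# assumptions for illustration, not the real layout of subtitles_all.txt.gz.
SAMPLE_METADATA = (
    "name\tyear\tlanguage\tkind\n"
    "Movie A\t2008\ten\tmovie\n"
    "Show B\t2015\ten\ttv\n"
    "Film C\t1999\tde\tmovie\n"
)

def filter_english_movies(handle):
    """Keep only rows describing English-language movie subtitles."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row for row in reader
        if row["language"] == "en" and row["kind"] == "movie"
    ]

english_movies = filter_english_movies(io.StringIO(SAMPLE_METADATA))
print([row["name"] for row in english_movies])
```

For the real file, the same function could be fed a handle from gzip.open(path, "rt"), provided the delimiter and column names are adjusted to match the actual metadata.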
Keyword search: "password". Each hit is extracted together with the 9 lines before and after the keyword (that is, 19 lines in total), yielding 5,982 scenes.
Two authors qualitatively analyzed 50 scenes with MAXQDA.
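A minimal sketch of the context-window extraction, assuming case-insensitive substring matching and treating each subtitle text line as one "line". The function name extract_scenes and the sample data are hypothetical. How the authors handled overlapping hits is not recorded in these notes, so overlapping windows are simply returned separately here.

```python
def extract_scenes(lines, keyword="password", context=9):
    """Return one window of up to 2 * context + 1 lines around each keyword hit."""
    scenes = []
    for i, line in enumerate(lines):
        if keyword in line.lower():
            start = max(0, i - context)               # clamp at start of file
            end = min(len(lines), i + context + 1)    # clamp at end of file
            scenes.append(lines[start:end])
    return scenes

# Toy subtitle file: 30 lines with a single keyword hit at index 15.
subtitle_lines = [f"line {n}" for n in range(30)]
subtitle_lines[15] = "What's the password?"

scenes = extract_scenes(subtitle_lines)
print(len(scenes), len(scenes[0]))
```

A hit close to the start or end of a file yields a shorter window, because the slice is clamped to the file boundaries.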
Analyze Password Topics and Password Attacks
Record all passwords in the subtitles and measure their strength using zxcvbn and Password Guessability Service (PGS).
Compare their strength with the Popular-200 (the 200 most used passwords of 2023), the Ignis 1M wordlist, and a list from the RockYou data breach.
Watch parts of 21 movies to gain deeper qualitative insights.
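One simple way to relate movie passwords to such wordlists is an exact-match membership check. The sketch below assumes that approach for illustration and is not necessarily how the paper performed the comparison. The function name wordlist_coverage, the passwords, and the toy lists are all made up; they are not data from the paper or from the real lists.

```python
def wordlist_coverage(passwords, wordlists):
    """For each named wordlist, return the fraction of passwords found in it."""
    coverage = {}
    for name, words in wordlists.items():
        known = set(words)  # set lookup keeps each membership test O(1)
        hits = sum(1 for pw in passwords if pw in known)
        coverage[name] = hits / len(passwords) if passwords else 0.0
    return coverage

# Made-up stand-ins for illustration only.
movie_passwords = ["123456", "swordfish", "Tr0ub4dor&3", "password"]
toy_wordlists = {
    "popular": ["123456", "password", "qwerty"],
    "breach": ["123456", "swordfish", "letmein"],
}

coverage = wordlist_coverage(movie_passwords, toy_wordlists)
print(coverage)
```

Exact matching is the strictest comparison. Guessability tools such as zxcvbn or PGS go further by estimating how quickly a password that is not on any list could still be guessed.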
Results
Types of movies that mention passwords, password behavior in movie scenes, the password life cycle, password sharing, and security best practices.
Depiction of Password Attacks
Password guessing, where a human actively guesses candidate passwords based on frequent passwords, specific knowledge about a person, or known old passwords (220 cases).
Password hacking, where other techniques are used to obtain the password, such as social engineering, shoulder surfing, or automated tools for (brute-force) guessing (63 cases).
My personal thoughts
Pros
Leverages data from an unexpected source (movie subtitles) to analyze a real-world problem (passwords).
Cons
The reasoning that links password-related behavior depicted in movies to real-world behavior is not well substantiated. Qualitative analysis is somewhat subjective and may not generalize.