Briefing
- Deanonymizing Developers – Two researchers from Drexel University and George Washington University have developed and trained a machine learning system that identifies programmers by their work after being fed only a few samples of their code
- Code Identifiers – The researchers programmed the AI to identify 50 features in code samples that can be used to differentiate one developer from another
- AI Training – Trained with code samples from Google’s annual Code Jam competition
- Accuracy – The algorithm identified the correct author among 100 programmers with 96% accuracy, and among 600 programmers with 83% accuracy
- Other Insights – Experienced developers are easier to identify than novice programmers. In a test deanonymizing 62 programmers, the algorithm's accuracy was 95% on code written to solve hard problems, compared to 90% on code written to solve easy problems
- Implications – The technique can identify plagiarizing students, hackers, and developers behind censorship circumvention tools, while raising privacy concerns for contributors to coding community platforms such as GitHub
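To make the idea of "features that differentiate one developer from another" concrete, here is a toy sketch of code stylometry in Python. It is not the researchers' system: the five features below are hypothetical stand-ins for the 50 mentioned above, and the nearest-centroid classifier is a simple substitute chosen for brevity, since the briefing does not say which model was used.

```python
import math
import re
from collections import defaultdict


def style_features(code: str) -> list[float]:
    """Map a code sample to a small vector of layout and naming habits.

    All five features are illustrative assumptions, not the features
    used in the research summarized above.
    """
    lines = [ln for ln in code.splitlines() if ln.strip()]
    n_lines = max(len(lines), 1)
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code)
    n_names = max(len(names), 1)
    operators = re.findall(r"[=+\-*/<>]", code)
    spaced_operators = re.findall(r" [=+\-*/<>] ", code)
    return [
        # Share of non-blank lines indented with a tab.
        sum(ln.startswith("\t") for ln in lines) / n_lines,
        # Share of non-blank lines that are comments.
        sum(ln.strip().startswith(("#", "//")) for ln in lines) / n_lines,
        # Share of word tokens containing an underscore (snake_case habit).
        sum("_" in name for name in names) / n_names,
        # Mean word-token length, divided by 10 to keep scales comparable.
        sum(len(name) for name in names) / n_names / 10.0,
        # Share of operator characters surrounded by single spaces.
        len(spaced_operators) / max(len(operators), 1),
    ]


def train(samples: list[tuple[str, str]]) -> dict[str, list[float]]:
    """Average the feature vectors of each author's samples into a centroid."""
    grouped = defaultdict(list)
    for author, code in samples:
        grouped[author].append(style_features(code))
    return {
        author: [sum(column) / len(column) for column in zip(*vectors)]
        for author, vectors in grouped.items()
    }


def predict(centroids: dict[str, list[float]], code: str) -> str:
    """Return the author whose centroid is closest to the sample's features."""
    vector = style_features(code)
    return min(centroids, key=lambda author: math.dist(vector, centroids[author]))
```

Calling `train` on a list of `(author, code)` pairs and then `predict` on an unseen sample returns the closest-matching author. With only a handful of features this separates authors whose habits differ sharply (tabs versus spaces, naming style); telling hundreds of programmers apart, as reported above, would need a far richer feature set and a stronger model.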