Authorship attribution. Foundations and Trends in Information Retrieval 2006; 1(3):233–334.
Inference in an authorship problem. Journal of the American Statistical Association 1963; 58(302):275–309.
Authorship analysis: identifying the author of a program. Technical report CSD-TR-94-030, Department of Computer Sciences, Purdue University, West Lafayette, Indiana, May 1994.
Source code authorship attribution. PhD Thesis, School of Computer Science and Information Technology, RMIT University, Melbourne, Australia, November 2010.
Efficient plagiarism detection for large code repositories. Software: Practice and Experience 2006; 37(2):151–175.
“Uni cheats racket”: a case study in plagiarism investigation. In Proceedings of the Sixth Australasian Computing Education Conference, Lister R, Young A (eds). Australian Computer Society: Dunedin, New Zealand, 2004; 357–365.
Software development marketplaces—implications for plagiarism. In Proceedings of the Ninth Australasian Computing Education Conference, Mann S, Simon (eds). CPRIT, Australian Computer Society: Ballarat, Australia, 2007; 27-33.
SCO sues Big Blue over Unix, Linux. CNET News.com, 2003. (Available from: http://news.com.com/2100-1016-991464.html [Accessed 4 October 2007]).
Temporally robust software features for authorship attribution. In Proceedings of the Thirty-Third Annual IEEE International Computer Software and Applications Conference, Hey T, Bertino E, Getov V, Liu L (eds). IEEE Computer Society Press: Seattle, Washington, 2009; 599–606.
A complexity measure. IEEE Transactions on Software Engineering 1976; SE-2(4):308–320.
Natural laws controlling algorithm structure? ACM SIGPLAN Notices 1972; 7(2):19–26.
Extraction of Java program fingerprints for software authorship identification. Journal of Systems and Software 2004; 72(1):49–57.
Source code authorship analysis for supporting the cybercrime investigation process. In Proceedings of the First International Conference on E-business and Telecommunication Networks, Filipe J, Belo C, Vasiu L (eds), Institute for Systems and Technologies of Information, Control and Communication. Kluwer Academic Publishers: Setubal, 2004; 85–92.
Authorship analysis: identifying the author of a program. Computers and Security 1997; 16(3):233–257.
Software forensics for discriminating between program authors using case-based reasoning, feed-forward neural networks and multiple discriminant analysis. In Proceedings of the Sixth International Conference on Neural Information Processing, Gedeon T, Wong P, Halgamuge S, Kasabov N, Nauck D, Fukushima K (eds). IEEE Computer Society Press: Perth, Australia, 1999; 66–71.
A probabilistic approach to source code authorship identification. In Proceedings of the Fourth International Conference on Information Technology, Latifi S (ed.). IEEE Computer Society Press: Las Vegas, NV, 2007; 243–248.
Using code metric histograms and genetic algorithms to perform author identification for software forensics. In Proceedings of the Ninth Annual Conference on Genetic and Evolutionary Computation, Thierens D (ed.), ACM Special Interest Group on Genetic and Evolutionary Computation. ACM Press: London, 2007; 2082–2089.
Detecting outsourced student programming assignments. Journal of Computing Sciences in Colleges 2008; 23(3):50–57.
On the use of discretised source code metrics for author identification. In Proceedings of the First International Symposium on Search Based Software Engineering, Harman M, Di Penta M, Poulding S (eds). IEEE Computer Society Press: Windsor, 2009; 69–78.
Source code authorship attribution using n-grams. In Proceedings of the Twelfth Australasian Document Computing Symposium, Spink A, Turpin A, Wu M (eds). RMIT University: Melbourne, 2007; 32–39.
Source code author identification based on n-gram author profiles. In Artificial Intelligence Applications and Innovations, Maglogiannis IG, Karpouzis K, Bramer M (eds). Springer: New York City, NY, 2006; 508–515.
Supporting the cybercrime investigation process: effective discrimination of source code authors based on byte-level information. In Proceedings of the Second International Conference on E-business and Telecommunication Networks, Filipe J, Vasiu L (eds), Institute for Systems and Technologies of Information, Control and Communication. INSTICC Press: Reading, 2005; 283–290.
Application of information retrieval techniques for source code authorship attribution. In Proceedings of the Fourteenth International Conference on Database Systems for Advanced Applications, Zhou X, Yokota H, Kotagiri R, Lin X (eds). Springer: Brisbane, 2009; 699–713.
A probabilistic model of information retrieval: development and comparative experiments part 1. Information Processing and Management 2000; 36(6):779–808.
A probabilistic model of information retrieval: development and comparative experiments part 2. Information Processing and Management 2000; 36(6):809–840.
Search Engine Group. About Zettair, RMIT University, October 2009. (Available from: http://www.seg.rmit.edu.au/zettair/about.html [Accessed 27 April 2010]).
Flex Project. flex: the fast lexical analyser, February 2008. (Available from: http://www.flex.sourceforge.net [Accessed 31 May 2010]).
Effective identification of source code authors using byte-level information. In Proceedings of the Twenty-Eighth International Conference on Software Engineering, Osterweil LJ, Rombach D, Soffa ML (eds), ACM Special Interest Group on Software Engineering. ACM Press: Shanghai, China, 2006; 893–896.
Identifying authorship by byte-level n-grams: the source code author profile (SCAP) method. International Journal of Digital Evidence 2007; 6(1):1–18.
Examining the significance of high-level programming features in source code author classification. Journal of Systems and Software 2008; 81(3):447–460.
IDENTIFIED (integrated dictionary-based extraction of non-language-dependent token information for forensic identification, examination, and discrimination): a dictionary-based system for extracting source code metrics for software forensics. In Proceedings of the Third Software Engineering: Education and Practice International Conference, Purvis M, Cranefield S, MacDonell SG (eds), IEEE Computer Society. Technical Communication Services: Dunedin, 1998; 252–259.
IDENTIFIED: software authorship analysis with case-based reasoning. In Proceedings of the Fourth International Conference on Neural Information Processing and Intelligent Information Systems, Kasabov N (ed.). Asian Pacific Neural Network Assembly. IEEE Computer Society Press: Dunedin, 1997; 53–56.
Authorship analysis: identifying the author of a program. Technical Report TR-96-052, Department of Computer Sciences, Purdue University, West Lafayette, Indiana, September 1996.
WinZip Computing. Winzip—the zip file utility for windows, May 2009. (Available from: http://www.winzip.com [Accessed 12 May 2009]).
Discriminant analysis. Biometrics 1979; 35(1):69–85.
Approximate nearest neighbours: towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Vitter J (ed.), ACM Special Interest Group on Algorithms and Computation Theory. ACM Press: Dallas, TX, 1998; 604–613.
Exhedra Solutions Inc. Planet Source Code, March 2010. (Available from: http://www.planet-source-code.com [Accessed 11 March 2010]).
SpotSigs: robust and efficient near duplicate detection in large web collections. In Proceedings of the Thirty-First Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seng Chua T, Kew Leong M, Myaeng SH, Oard DW, Sebastiani F (eds), ACM Special Interest Group on Information Retrieval. ACM Press: Singapore City, 2008; 563–570.
Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd edn. Morgan Kaufmann Publishers: San Francisco, CA, 1999.
Music ranking techniques evaluated. In Proceedings of the Twenty-Fifth Australasian Computer Science Conference, Oudshoorn M, Pose R (eds). Australian Computer Society: Melbourne, 2002; 275–283.
Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann Publishers: San Francisco, CA, 2005.
Does SVM really scale up to large bag of words feature spaces? In Proceedings of the Seventh International Symposium on Intelligent Data Analysis, Berthold MR, Shawe-Taylor J, Lavrac N (eds). Springer: Ljubljana, 2007; 296–307.
Object Oriented Programming Using C++, 4th edn. Course Technology: Boston, MA, 2008.
International Standardization Organization and International Electrotechnical Commission. Programming languages — C++. International standard 14882, Information Technology Industry Council, New York City, NY, September 1998.
A Book on C, 4th edn. Addison Wesley Longman: Reading, MA, 1997.
Introduction to Java Programming: Comprehensive Version, 6th edn. Pearson Education Inc.: Upper Saddle River, New Jersey, 2006.
JavaTech: An Introduction to Scientific and Technical Computing with Java, 1st edn. Cambridge University Press: New York City, NY, 2005.
A taxonomy for programming style. In Proceedings of the Eighteenth ACM Annual Conference on Cooperation, Sood A (ed.), Association for Computing Machinery. ACM Press: New York City, NY, 1990; 244–250.
The Elements of C Programming Style, 1st edn. R. R. Donnelley & Sons: New York City, NY, 1992.
Software Engineering Metrics and Models, 1st edn. Benjamin/Cummings Publishing Inc.: Redwood City, CA, 1986.
Data referencing: an empirical investigation. Computer 1979; 12(12):50–59.