


Hossein Hassani, Christina Beneki, Emmanuel Sirimal Silva, Nicolas Vandeput, and Dag Øivind Madsen. The science of statistics versus data science: what is the future? Technological Forecasting and Social Change, 173:121111, 2021.


Amit Datta, Michael Carl Tschantz, and Anupam Datta. Automated experiments on ad privacy settings: a tale of opacity, choice, and discrimination. 2015. arXiv:1408.6491.


Muhammad Ali, Piotr Sapiezynski, Miranda Bogen, Aleksandra Korolova, Alan Mislove, and Aaron Rieke. Discrimination through optimization. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW):1–30, November 2019. doi:10.1145/3359301.


Reuters. Amazon scraps secret AI recruiting tool that showed bias against women. October 2018. Visited on 2023-06-05.


Kirsten Martin. Ethical Implications and Accountability of Algorithms. Journal of Business Ethics, 160(4):835–850, December 2019. doi:10.1007/s10551-018-3921-3. Visited on 2023-06-09.


Solon Barocas and Andrew D. Selbst. Big Data's Disparate Impact. California Law Review, 104(3):671–732, 2016. Publisher: California Law Review, Inc. Visited on 2023-06-09.


Catherine D'Ignazio and Lauren F. Klein. Data Feminism. Strong Ideas series. The MIT Press, Cambridge, Massachusetts; London, England, 2020. ISBN 978-0-262-04400-4.


C. Stinson. Algorithms are not neutral. AI and Ethics, 2:763–770, 2022. doi:10.1007/s43681-022-00136-w.


Sina Fazelpour and David Danks. Algorithmic bias: Senses, sources, solutions. Philosophy Compass, 16(8):e12760, 2021. doi:10.1111/phc3.12760. Visited on 2023-06-09.


Shoshana Zuboff. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. Profile Books, 1st edition, 2019. ISBN 9781781256848.


Solon Barocas, Moritz Hardt, and Arvind Narayanan. Fairness and Machine Learning: Limitations and Opportunities. MIT Press, 2023.


Cathy O'Neil. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown, 2017.


Mike Loukides, Hilary Mason, and D. J. Patil. Ethics and Data Science. O'Reilly Media, 1st edition, July 2018.


Vladimir Pletser and Dirk Huylebrouck. The Ishango Artefact: the Missing Base 12 Link. Forma, 1999.


Philip Russom and others. Big data analytics. TDWI best practices report, fourth quarter, 19(4):1–34, 2011.


Martin Frické. The knowledge pyramid: a critique of the DIKW hierarchy. Journal of Information Science, 35(2):131–142, April 2009. Publisher: SAGE Publications Ltd. doi:10.1177/0165551508094050. Visited on 2024-04-22.


Ralf Otte, Boris Wippermann, Sebastian Schade, and Viktor Otte. Von Data Mining bis Big Data: Handbuch für die industrielle Praxis. Carl Hanser Verlag GmbH & Co. KG, München, 1st edition, July 2020. ISBN 978-3-446-45550-4.


Karl Popper. Karl Popper: Logik der Forschung. Akademie Verlag, July 2013. ISBN 978-3-05-006378-2. doi:10.1524/9783050063782. Visited on 2023-09-04.


Thomas Bartelborth. Die erkenntnistheoretischen Grundlagen induktiven Schließens. Universität Leipzig, 2017.


Keith Lehrer. Theory of Knowledge: Second Edition. Routledge, New York, 2nd edition, September 2019. ISBN 978-0-429-49426-0. doi:10.4324/9780429494260.


Max Boisot and Agustí Canals. Data, information and knowledge: have we got it right? Journal of Evolutionary Economics, 14(1):43–67, January 2004. doi:10.1007/s00191-003-0181-9. Visited on 2024-04-22.


Claus Weihs and Katja Ickstadt. Data Science: the impact of statistics. International Journal of Data Science and Analytics, 6(3):189–194, November 2018. doi:10.1007/s41060-018-0102-5. Visited on 2023-09-05.


David Donoho. 50 Years of Data Science. Journal of Computational and Graphical Statistics, 26(4):745–766, October 2017. Publisher: Taylor & Francis. doi:10.1080/10618600.2017.1384734. Visited on 2023-09-05.


Rüdiger Wirth and Jochen Hipp. CRISP-DM: towards a standard process model for data mining. In Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, volume 1, 29–39. Manchester, 2000.


Hilary Mason and Chris Wiggins. A Taxonomy of Data Science. Dataists, 2010. Visited on 2023-09-05.


Philip Guo. Data Science Workflow: Overview and Challenges. 2022. Visited on 2023-09-05.


Justin Matejka and George Fitzmaurice. Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, CHI '17, 1290–1294. New York, NY, USA, May 2017. Association for Computing Machinery. doi:10.1145/3025453.3025912. Visited on 2023-09-07.


Thomas Piketty. Capital in the Twenty-First Century. Belknap Press, Cambridge, MA, April 2014. ISBN 978-0-674-43000-6.


David Lane, David Scott, Mikki Hebl, Rudy Guerra, Dan Osherson, and Heidi Zimmer. Introduction to Statistics. Citeseer, 2003.


Peter Bruce and Andrew Bruce. Practical Statistics for Data Scientists. O'Reilly, Beijing, Boston, Farnham, Sebastopol, Tokyo, 1st edition, June 2017. ISBN 978-1-4919-5296-2.


F. J. Anscombe. Graphs in Statistical Analysis. The American Statistician, 27(1):17–21, February 1973. Publisher: Taylor & Francis. doi:10.1080/00031305.1973.10478966. Visited on 2024-04-22.


Amit Saxena, Mukesh Prasad, Akshansh Gupta, Neha Bharill, Om Prakash Patel, Aruna Tiwari, Meng Joo Er, Weiping Ding, and Chin-Teng Lin. A review of clustering techniques and developments. Neurocomputing, 267:664–681, 2017.


Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining, 413–422. IEEE, 2008.


Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander. LOF: identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 93–104. 2000.


Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008.


Leland McInnes, John Healy, and James Melville. UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018.


Allison Marie Horst, Alison Presmanes Hill, and Kristen B. Gorman. palmerpenguins: Palmer Archipelago (Antarctica) penguin data. 2020. R package version 0.1.0. doi:10.5281/zenodo.3960218.


Narendra Kumar Gupta, Giuseppe Di Fabbrizio, and Patrick Haffner. Capturing the stars: predicting ratings for service and product reviews. In HLT-NAACL 2010. 2010.


Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. October 2013. arXiv:1310.4546 [cs, stat]. doi:10.48550/arXiv.1310.4546. Visited on 2023-06-12.


Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. September 2013. arXiv:1301.3781 [cs]. doi:10.48550/arXiv.1301.3781. Visited on 2023-06-12.


Keith McNulty. Handbook of Graphs and Networks in People Analytics: With Examples in R and Python. Routledge & CRC Press, 2022. ISBN 978-1-03-220497-0. Visited on 2024-01-23.


Executable Books Community. Jupyter Book. February 2020. doi:10.5281/zenodo.4539666. Visited on 2023-06-06.
