Hackers of India

InfoSec Deep Learning in Action

By Satnam Singh on 06 Mar 2020 @ Nullcon


Abstract

In the last few years, cybercrooks have sped up their plans to make quick money through ransomware attacks. The question for defenders is: what can we do to defend against them? The cybersecurity industry is pitching deep learning heavily as the way to combat these threats. In this talk, I will cut through the hype and discuss the reality of what deep learning can do for information security (InfoSec). I will share use cases, data pipelines, algorithms, and code, along with the challenges faced while deploying them in the wild. This session aims to cover both the breadth and the depth of applying machine learning and deep learning to InfoSec.

AI Generated Summary (may contain errors)

Identity and Purpose

The speaker and the purpose of the talk are not explicitly stated, but from context the speaker appears to be an expert in machine learning and AI, particularly as applied to cybersecurity.

Content Summary

  1. The speaker talks about building a REST API for summarizing content, where user feedback will be stored.
  2. They mention the importance of considering various aspects, such as user feedback, to create a product or solution.
  3. The speaker then shifts focus to red teaming, which involves simulating cyber attacks on an organization’s computer systems to test their defenses.
  4. They discuss several use cases for machine learning in red teaming, including:
    • Automated phishing attacks: using scripts to collect data about company employees and constructing a classifier to predict who is likely to fall prey to phishing attacks.
    • Password guessing: building deep learning neural networks that can generate passwords based on available datasets (e.g., from Kaggle).
  5. The speaker mentions their work at Acalvio, where they have developed a product that uses machine learning, AI, and deception to help companies detect hidden threats in their networks.

Q&A Session

  1. A question is asked about dealing with unstructured data having many entities, specifically how to find what’s statistically relevant.
  2. The speaker responds by suggesting the use of log processing tools (e.g., ELK, Splunk) to convert unstructured data into structured form and then applying feature selection techniques from data science, such as information gain, to identify relevant attributes.
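
The answer above can be illustrated with a minimal sketch, which is not the speaker's code. It assumes a made-up log format (`proto=`, `hour=`, `label=` fields) and a tiny hand-written dataset; in practice a tool such as ELK or Splunk would do the parsing step. The function names `parse`, `entropy`, `information_gain`, and `rank_attributes` are hypothetical and exist only for this example. The sketch turns raw lines into structured records and then ranks attributes by information gain against a known label.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical log format, invented for this sketch (not from the talk).
LOG_PATTERN = re.compile(
    r"proto=(?P<proto>\S+) hour=(?P<hour>\S+) label=(?P<label>\S+)"
)

# Toy data, hand-written for illustration only.
RAW_LOGS = [
    "proto=ssh hour=night label=malicious",
    "proto=ssh hour=day label=benign",
    "proto=http hour=day label=benign",
    "proto=ssh hour=night label=malicious",
    "proto=http hour=night label=malicious",
    "proto=http hour=day label=benign",
]


def parse(lines):
    """Turn raw log lines into dict records, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            records.append(match.groupdict())
    return records


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(records, attribute, target="label"):
    """Entropy of the target minus its expected entropy after splitting on attribute."""
    base = entropy([r[target] for r in records])
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r[target])
    remainder = sum(
        len(g) / len(records) * entropy(g) for g in groups.values()
    )
    return base - remainder


def rank_attributes(records, attributes, target="label"):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    scored = [(a, information_gain(records, a, target)) for a in attributes]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


records = parse(RAW_LOGS)
for name, gain in rank_attributes(records, ["proto", "hour"]):
    print(f"{name}: {gain:.3f} bits")
```

In this toy data `hour` separates the two labels perfectly, so it ranks above `proto`. One caveat when applying this to real logs: attributes with many distinct values (user IDs, IP addresses) tend to receive inflated information gain, so they are usually bucketed or scored with a normalized measure first.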

The speaker concludes by offering to take further questions and interact with the audience.