BSidesLV 2017 has ended
Wednesday, July 26 • 11:00 - 11:25
Building a Benign Data Set

Though featurization is important, the datasets used to draw conclusions matter just as much, if not more. Information security researchers often cannot release data, which results in a lack of benchmark datasets and leaves cross-dataset generalization understudied in this domain. Despite this, the presence of dataset bias (especially negative set bias) is now common knowledge in machine learning for malware classification. For these reasons, we have developed a standard for benign datasets to be used in machine learning for malware classification. We are also releasing a sample benign dataset designed to minimize these problems.

Rob Brandon

Security Researcher, Booz Allen Hamilton
Rob is currently a security researcher with Booz Allen Hamilton Dark Labs. He has over a decade of experience in the security field, primarily in the areas of network traffic analysis, forensics, reverse engineering, and machine learning. Rob holds a PhD in Computer Science...

John Seymour

University of Maryland, Baltimore County
John Seymour is a Senior Data Scientist at ZeroFOX, Inc. by day, and a Ph.D. student at University of Maryland, Baltimore County by night. He researches the intersection of machine learning and InfoSec in both roles. He’s mostly interested in dataset bias (seriously, do people still...

Wednesday July 26, 2017 11:00 - 11:25 PDT
Ground Truth (Firenze) 255 E Flamingo Rd, Las Vegas, NV 89169