The minds behind the magic

    Applying Bayesian Methods to Clustering Models (feat. Memorial Sloan Kettering)

    New York City campuses

    GA NYC (Manhattan), HQ
    10 E 21st St, 3rd Floor
    New York NY 10010

    GA NYC (Manhattan), Classrooms
    10 East 21st Street
    New York NY 10010

    About this event

    Tentative Schedule:
    6:30pm: Pizza + Beer networking
    7:00pm: TBD with Data Scientist at Dataiku
    7:30pm: A Bayesian Approach To Model Overlapping Objects Available As Distance Data with Sandhya Prabhakaran, Research Fellow at Memorial Sloan Kettering Cancer Center

    Talk Abstracts:

    A Bayesian Approach To Model Overlapping Objects Available As Distance Data, with Sandhya Prabhakaran, Research Fellow at Memorial Sloan Kettering Cancer Center

    Traditional clustering methods often partition objects into mutually exclusive clusters; in practice, it is more realistic for objects to belong to multiple, overlapping clusters. When healthcare data is available only as pairwise distances (such as genomic string alignments, protein contact maps, or pairwise patient similarities), there is no probabilistic clustering model that allows such overlap, and existing solutions for this setting are often noisy and heavily biased. It would therefore be advantageous to have a model that clusters distance data directly.

    In this talk, we'll address this problem and introduce a Probabilistic model for Overlapping Clustering on Distance data (POCD) that gives objects the freedom to belong to one or more clusters at the same time. Because POCD is a probabilistic model, its output is a set of samples from a distribution over partitions, and it uses an Indian Buffet Process (IBP) prior to remove the need to fix the number of overlapping clusters in advance. We will demonstrate the benefits of working with distances directly and the utility of POCD on both simulated and real-world distance data: neonatal patient data and HIV-1 protease inhibitor contact maps.

    (This is joint work with Julia E. Vogt, Department of Computer Science, ETH, Switzerland, and Swiss Institute of Bioinformatics (SIB), Basel, Switzerland.)
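
    For readers unfamiliar with the Indian Buffet Process mentioned in the abstract, the following is a minimal sketch of the standard IBP generative process only; it is not the POCD model and does not use distance data. It draws a binary matrix in which each row is an object and each column is a cluster, so an object can belong to several clusters and the number of clusters is not chosen in advance. The function names, the seed, and the concentration parameter alpha are illustrative assumptions, not taken from the talk.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_objects, alpha, seed=0):
    """Return a binary matrix Z as a list of rows; Z[i][k] == 1 means object i is in cluster k.

    Hypothetical helper for illustration: alpha controls how many clusters tend to appear.
    """
    rng = random.Random(seed)
    rows = []           # one list of 0/1 flags per object, padded to equal width at the end
    column_counts = []  # m_k: number of earlier objects already in cluster k
    for i in range(1, num_objects + 1):
        # Object i joins each existing cluster k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in column_counts]
        # Object i also opens Poisson(alpha / i) brand-new clusters.
        new_clusters = sample_poisson(alpha / i, rng)
        row.extend([1] * new_clusters)
        # zip stops at the existing clusters; new clusters start with a count of 1.
        column_counts = [m + z for m, z in zip(column_counts, row)] + [1] * new_clusters
        rows.append(row)
    width = len(column_counts)
    return [row + [0] * (width - len(row)) for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_objects=6, alpha=2.0, seed=1):
        print(row)
```

    Rows with more than one 1 correspond to objects that sit in several overlapping clusters at once. Changing the seed or alpha changes how many columns appear, which is the sense in which the number of clusters does not have to be fixed beforehand.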
