    Applying Bayesian Methods to Clustering Models (feat. Memorial Sloan Kettering)

    Online Campus (Online, Anywhere)

    About this event

    Tentative Schedule:
    6:30pm: Pizza + Beer networking
    7:00pm: TBD with Data Scientist at Dataiku
    7:30pm: A Bayesian Approach To Model Overlapping Objects Available As Distance Data with Sandhya Prabhakaran, Research Fellow at Memorial Sloan Kettering Cancer Center

    Talk Abstracts:

    A Bayesian Approach To Model Overlapping Objects Available As Distance Data with Sandhya Prabhakaran, Research Fellow at Memorial Sloan Kettering Cancer Center

    Traditional clustering methods often partition objects into mutually exclusive clusters. In practice, it is more realistic for objects to belong to multiple, overlapping clusters. When healthcare data is available only as pairwise distances (for example, genomic string alignments, protein contact maps, or pairwise patient similarities), there is no probabilistic clustering model that allows such overlap, and existing solutions for this kind of data are often noisy and heavily biased. It would therefore be advantageous to have a model that clusters distance data directly.

    In this talk, we'll address this problem and introduce a Probabilistic model for Overlapping Clustering on Distance data (POCD) that lets each object belong to one or more clusters at the same time. Because POCD is a probabilistic model, its output is a set of samples from a distribution over partitions. It uses an Indian Buffet Process (IBP) prior, which removes the need to fix the number of overlapping clusters in advance. We will demonstrate the benefits of working with distances directly and the utility of POCD on both simulated and real-world distance data: neonatal patient data and HIV-1 protease inhibitor contact maps.
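
To make the role of the IBP concrete, here is a minimal sketch of drawing a binary object-by-cluster membership matrix from an IBP using its standard sequential construction. It is not the POCD model or its inference procedure, which the abstract does not spell out; the function name `sample_ibp` and the parameter `alpha` are assumptions made for illustration.

```python
import numpy as np


def sample_ibp(num_objects, alpha, rng):
    """Draw a binary membership matrix Z from an IBP with concentration alpha.

    Z[i, k] == 1 means object i belongs to cluster k. The number of columns
    (clusters) is not fixed in advance; it is an outcome of the draw.
    """
    if num_objects < 1 or alpha <= 0:
        raise ValueError("num_objects must be >= 1 and alpha must be > 0")

    rows = []    # one 0/1 list per object, growing in length over time
    counts = []  # counts[k] = number of objects so far in cluster k

    for i in range(1, num_objects + 1):
        # Join each existing cluster k with probability counts[k] / i.
        row = [int(rng.random() < m / i) for m in counts]
        # Open a Poisson(alpha / i) number of brand-new clusters.
        num_new = int(rng.poisson(alpha / i))
        row.extend([1] * num_new)
        # zip stops at the existing clusters; new clusters start at count 1.
        counts = [m + z for m, z in zip(counts, row)] + [1] * num_new
        rows.append(row)

    # Pad earlier (shorter) rows with zeros to get a rectangular matrix.
    Z = np.zeros((num_objects, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z


rng = np.random.default_rng(0)
Z = sample_ibp(num_objects=10, alpha=2.0, rng=rng)
print(Z.shape)        # (10, K), where K varies from draw to draw
print(Z.sum(axis=1))  # number of clusters each object belongs to
```

Rows with more than one nonzero entry correspond to objects that sit in several clusters at once, which is the overlap the talk is concerned with. Larger values of `alpha` tend to produce more clusters.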

    (This is joint work with Julia E. Vogt, Department of Computer Science, ETH, Switzerland, and Swiss Institute of Bioinformatics (SIB), Basel, Switzerland.)
