In this Learn Python video, Flávio Juvenal da Silva Junior, presenting at digital PyCon, explains record deduplication and how it works in Python. Datasets often contain duplicate records. People can usually spot these duplicates easily, but computers may fail to recognize them because of small variations between the copies. Record deduplication is the technique for handling these cases.
The technique works by joining records in a fuzzy way, using data such as names, addresses, phone numbers, and dates. The video shows how it has helped various companies and government agencies, and the talk highlights the most common workflow for the process and the algorithms involved. To learn more about record deduplication with Python, watch the video below.
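To make the idea of fuzzy joining concrete, here is a minimal sketch using only Python's standard library. It is not the method or the tooling presented in the talk: the sample records, the `similarity`, `record_score`, and `find_duplicates` helpers, and the 0.8 threshold are all assumptions made up for illustration, and `difflib.SequenceMatcher` stands in for whatever string-comparison algorithm a real deduplication workflow would use.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample data: the second record is a noisy copy of the first.
records = [
    {"name": "Maria Souza", "address": "12 Oak Street", "phone": "555-0101"},
    {"name": "Maria Sousa", "address": "12 Oak St.", "phone": "5550101"},
    {"name": "John Carter", "address": "98 Pine Avenue", "phone": "555-0199"},
]


def similarity(a, b):
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_score(r1, r2, fields=("name", "address", "phone")):
    """Average the per-field similarities of two records."""
    return sum(similarity(r1[f], r2[f]) for f in fields) / len(fields)


def find_duplicates(rows, threshold=0.8):
    """Return index pairs whose average similarity reaches the threshold.

    The 0.8 default is an arbitrary value chosen for this sketch.
    """
    pairs = []
    for (i, r1), (j, r2) in combinations(enumerate(rows), 2):
        if record_score(r1, r2) >= threshold:
            pairs.append((i, j))
    return pairs


# Exact comparison misses the duplicate; fuzzy comparison flags it.
print(records[0] == records[1])  # False
print(find_duplicates(records))  # [(0, 1)]
```

Comparing every pair of records, as this sketch does, becomes slow on large datasets, and a fixed threshold chosen by hand will not suit every dataset.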