
I feel lucky to have attended the AWS INNOVATE – AI/ML Edition conference for the first time in my career. The conference, held on the 24th of February 2020, was not only highly informative but would also inspire any data-driven company like Norconex to run its AI/ML projects in the cloud. It offered sessions for users at various levels of expertise. I hoped the experience would deepen my own insight and help the team at Norconex build our SaaS products.

AWS Innovate banner

The event kicked off with a keynote delivered by Denis V Batalov, WW Technical Leader for AI/ML at AWS. The session showcased state-of-the-art computer vision techniques and advances in machine learning, along with their applications in autonomous vehicles, IoT, software development, and more. The video clip Batalov shared, showing how Amazon used computer vision and AWS to automate its fulfillment centers, seemed magical. He then shed some light on how we can use AI/ML at work and listed some of the latest AI/ML products released by AWS. The conference ran on several parallel tracks, which meant I had to pick and choose between numerous engaging sessions. Luckily, on-demand videos are available on the AWS Innovate website and can be watched later.

Following the keynote, I jumped into the session “Prepare your Datasets at Scale using Apache Spark and SageMaker Data Wrangler.” The speaker, Chris Fregly – Developer Advocate, AI/ML at AWS, explained how we could take advantage of the distributed processing capabilities of Apache Spark using AWS SageMaker Data Wrangler. Using this combination, he illustrated how the built-in features of Data Wrangler support data collection, preprocessing, and feature engineering. Fascinatingly, Data Wrangler has out-of-the-box components that can detect class imbalance, bias, correlation, feature importance, and many more characteristics. Using SparkML, we can train a model across Spark nodes in parallel, which takes far less time than traditional methods. At Norconex, we use AWS and Spark in our projects, so this is undoubtedly a takeaway we could explore.

After this session, I hopped into a session by Antje Barth – Senior Developer Advocate, AI/ML at AWS on “Automating ML workflows with end-to-end pipelines.” She walked us through the MLOps capabilities of SageMaker Pipelines for setting up ML projects in the world of CI/CD. Another exciting part was the use of SageMaker Model Monitor to watch a model after it has been deployed to production. I also attended a session on AWS Security for ML by Shelbee Eigenbrode – AI/ML Specialist Solutions Architect, AWS. I learned how AWS has extended the security features of Amazon Transcribe to remove or protect PII during data collection. This option could be useful for Norconex when we crawl content containing PII. Here is a glimpse of the AWS ML services stack, which I captured in one of the sessions.

AWS ML stack

The part of the conference I had most anticipated was the session on “Intelligent Search” by Ryan Peterson – Enterprise Search Expert at AWS. He illustrated the intelligent search capabilities of Amazon Kendra, an AWS search service. Kendra’s core capabilities include natural language querying, natural language understanding, models trained with domain-specific data, continuous improvement via user feedback, secure search (TLS and encryption), and many more. Ryan also emphasized how much precious work time employees in various organizations waste looking for content, and how intelligent search can recover that time. As a developer at Norconex – a pioneer in Enterprise Search, I wholeheartedly concur. We are focused on intelligent search, which certainly offers a vast improvement over conventional search techniques.

The online/virtual conference ended with closing remarks by Chris Fregly, who provided a handy summary of the new AWS ML/AI services and their capabilities. I am happily overwhelmed by the learning experience I had at the event and eagerly look forward to participating in more quality events like this. 

Amazon Web Services (AWS) and the Canadian Public Sector organized another excellent Public Sector Summit on May 15, 2019. AWS hosted the first such summit in Ottawa last year, but this year’s event attracted a much larger crowd. Thousands of attendees filled Shaw Centre’s entire third floor.

In the keynote sessions, it was great to hear Alex Benay (deputy minister at the Treasury Board of Canada) talk about the government’s modern digital initiative. He discussed the approach, successes, and challenges of the government’s cloud migration journey. Another excellent speaker was Mohamed Frendi (director of IT at Innovation, Science and Economic Development Canada). He covered Canada’s API Store and how it uses the cloud to make government data more accessible.

The afternoon session was led by Darin Briskman, an AWS developer evangelist. He talked about Amazon’s self-service analytics tool, AWS Lake Formation, which combines data from multiple sources so that data-driven challenges can be resolved in a timely manner, with machine learning and AI helping to make informed decisions and solve problems. This service is a great fit for Norconex’s open-source crawler products, HTTP Collector and Filesystem Collector, which fetch data from unstructured data sources and make it easy to consume. Collected content and metadata are natively stored in various existing repositories (or formats), including AWS-specific ones like Amazon Elasticsearch Service, Amazon Open Distro Elasticsearch, and Amazon CloudSearch, as well as many others, such as relational databases, Apache Solr, Google Cloud Search, Neo4j, Microsoft Azure Search, Lucidworks, IDOL, and more.

The diagrams below provide further explanation. The one showing the crawling spider is particularly exciting, because Norconex crawlers have great potential to help in this area. See available Norconex Committers.

AWS Public Sector Summit Event Pass

Selfies with Darin Briskman, Developer Evangelist, AWS and Stevan Beara, Solutions Architect Manager, AWS.