The successful Search Engineer candidate will be able to design and develop complex multi-stage data ingestion pipelines and to deploy and operate them on the Cloudera CDH platform, including performance tuning and optimization, log analysis, issue resolution, and continuous improvement. The candidate will also maintain and administer an existing enterprise search deployment with multiple complex datasets consisting of structured, unstructured, and semi-structured data, subject to multiple performance SLAs of 99% and above. The candidate should possess up-to-date Solr skills, including but not limited to graph queries and streaming expressions, and strong experience handling data with JSON, NLP, Java, Python, the Cloudera Kite SDK, and other open source tools. Experience with Elasticsearch, Cloudera SDX, and machine learning is a plus. The candidate should have experience with DevOps platforms and tools, programming and scripting on Linux and Windows, testing complex applications, and process automation, along with agile skills and sprint releases. The candidate must be able to adapt to client needs, work both independently and with other teams and guide them, and must possess strong written and oral communication and collaboration skills.
- BS in Computer Science or a related field with 4-8 years of prior relevant experience, or equivalent work experience; or a Master's degree with 2-6 years of prior relevant experience
- 8-12-10 years of general software development experience
- 3+ years of enterprise search application development and administration experience with Cloudera Search, Apache Lucene/Solr 7, Elasticsearch, or Microsoft FAST search engines
- Understanding of and working experience with search engine administration tasks such as creating collections, configuring document processing, and managing search and index profiles
- Ability to integrate search engine content with various data sources, including MS SQL databases, file systems, and streaming content
- Experience with linguistic processing functions such as lemmatization and spell checking, search relevancy tuning, and customization of dictionaries, stop words, and synonyms
- Software development experience in highly scalable, distributed, large multi-node environments, including addressing scale and performance problems
- Experience with Flume, Morphlines, Kite SDK, Hue, and Hadoop MapReduce programming
- Strong programming experience in Java and other OOP languages, including multithreaded programming
- Experience developing data-centric tools, pipelines, and applications with Python, Sqoop, ZooKeeper, Avro, Pig, and Luigi/Oozie
- Experience implementing web service APIs and data contracts using JSON, XML, and CSV
- Ability to perform code reviews and recommend automated review tools for the project
- Ability to think through a system's performance requirements, devise tests for those scenarios, and troubleshoot server runtime issues
- Strong knowledge of software implementation best practices