Enhancements improve data analyst and data scientist productivity with up to 7x faster queries and a low-code approach to building, curating, and sharing data products
Starburst, the analytics anywhere company, today at its third annual Datanova conference announced platform updates that significantly improve query performance while lowering barriers to data consumption. Together, these features help organizations do more with less across analytics, AI, and ML workloads.
To make data more easily consumable, Starburst has been working to deliver a low-code solution for building, sharing, and curating data products across global data. Following last year’s launch of features for building data products in Starburst Enterprise, Starburst is now announcing the private preview of data products and an automated data catalog, with the ability to search and discover data across sources, in its cloud offering, Starburst Galaxy. The new capability automatically captures metadata from roles, user queries, and other user activity, such as adding a new dataset. These features follow the launch of key data and schema discovery and data privileges capabilities, announced at re:Invent in November, that streamline the traditional Extract, Transform, Load (ETL) process. The new UI and search components bring a marketplace experience that makes data products easy to find and consume, dramatically increasing data analyst and data scientist productivity.
Starburst is also investing in the Python ecosystem, giving data scientists a familiar environment in which to work in their preferred language. With the ability to access Starburst Enterprise and Starburst Galaxy using Python, data scientists can now use their favorite tools to access the same infrastructure and data as the rest of the organization. These enhancements were built in response to customers’ desire to migrate PySpark workloads to Starburst and Trino to improve performance, and they can now do so without rewriting their code. Combined with the recently introduced Fault-Tolerant Execution, these new features enable data engineers and data scientists to build more accurate and agile models, on more data, with higher success rates for long-running queries.
Lastly, Starburst is announcing Warp Speed, a smart indexing and caching solution that accelerates queries by up to 7x. It is available in private preview in Starburst Galaxy and will be generally available in Starburst Enterprise by the end of February. Its patented indexing technology autonomously identifies and caches the most used or most relevant data based on usage pattern analysis, while the rest of the data remains close to the source, optimizing data lake performance. This acceleration strategy eliminates the manual burden of selecting which data in the data lake to optimize and cache.
“Complexity across the data infrastructure is slowing time to insights and limiting the impact of data across organizations,” said Ali Huselid, SVP of Product, Starburst. “Starburst is focused on simplifying and accelerating data consumption by streamlining the delivery of data products. Today’s platform updates significantly raise the performance bar for data lake analytics, improving data engineer and scientist productivity while driving faster, more reliable insights across the enterprise.”