Introducing multiple-schema inferencing in Eventstream! This feature empowers you to work seamlessly with data sources that emit varying schemas by inferring and managing multiple schemas simultaneously. It eliminates the limitations of single-schema inferencing by enabling more accurate and flexible transformations, preventing field mismatches when switching between Live and Edit modes, and allowing you to view and correct inferred schemas directly. You can now preview data in a structured, schema-specific format and confidently configure transformation paths without being blocked by schema ambiguity or inaccurate inference - resulting in a smoother, more accurate authoring experience across schema-diverse data streams.
A schema is required for data transformation design in Eventstream, as each transformation operator requires data fields (columns) for configuration. Without a schema, operators cannot be configured for transformation.
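In Eventstream, schema inferencing is based on the data previewed from sources and from the eventstream itself. Previously, Eventstream inferred the schema solely from the first row of the previewed data, even when the data sample included multiple schemas. This single-schema approach presented several challenges, each addressed in the sections below: transformation paths could be designed only with fields from the one inferred schema; previewed data and test results containing multiple schemas were displayed with mixed columns; and authoring errors appeared on transformation paths when no matching schema was present in Edit mode.
In addition, the inferred schema could not be viewed or updated. If a field's data type was inferred incorrectly, you could not correct it, which prevented you from using that field properly in transformations.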
Schemas are inferred by analyzing the complete set of data previewed from both sources and eventstreams within a specific time range, not just the first row. If the previewed data contains multiple schemas, multiple schemas are inferred, and the previewed data or test results are organized by inferred schema. If there is no data in the source or eventstream, or if the source doesn’t support data preview, no schema is inferred. If the previewed data changes (for example, new fields are added or a data type changes), a new schema is inferred.
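To make the difference from first-row inference concrete, here is a minimal Python sketch. It is not Eventstream's actual algorithm: it simply assumes that two events share a schema when they have the same field names and coarse value types. The function names, the type labels other than Integer and Record, and the sample events are all illustrative.

```python
from collections import defaultdict

def type_name(value):
    """Map a decoded JSON value to a coarse, illustrative type label."""
    if value is None:
        return "Null"
    if isinstance(value, bool):  # test bool before int: bool is a subclass of int
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        return "Array"
    return "Record"

def signature(event):
    """Schema signature of one event: its sorted (field, type) pairs."""
    return tuple(sorted((name, type_name(value)) for name, value in event.items()))

def infer_schemas(events):
    """Group every previewed event by signature; each group is one inferred schema."""
    groups = defaultdict(list)
    for event in events:
        groups[signature(event)].append(event)
    return dict(groups)

# Hypothetical preview sample containing two different event shapes.
preview = [
    {"deviceId": "a1", "temperature": 21.5},
    {"orderId": 17, "status": "shipped"},
    {"deviceId": "b2", "temperature": 19.0},
]

schemas = infer_schemas(preview)          # looks at the whole sample
first_row_only = signature(preview[0])    # what first-row inference would see
```

In this sample, looking only at the first row would miss the order events entirely, while grouping the whole sample yields one schema per event shape. An empty sample yields no schemas, matching the behavior described above.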
If operators were previously configured with a particular schema in an eventstream, the schema used for that configuration is retained when the eventstream is published. When you re-enter Edit mode, the retained schema remains applied to those operators. This avoids the authoring errors that arise when the inferred schema differs from the one used in operator configurations, or when no schema is inferred.
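The effect of retaining a schema can be sketched as a validation rule. This is a hypothetical model, not Eventstream's internal representation: an operator is a dictionary that lists the fields it references and, optionally, the schema it was configured with (`retained_schema` is an assumed name).

```python
def missing_fields(operator, inferred_schema):
    """Return the fields an operator references that its schema does not contain.

    A retained schema, when present, takes precedence over whatever is
    inferred on re-entering Edit mode (hypothetical model of the behavior).
    """
    retained = operator.get("retained_schema")
    schema = retained if retained is not None else (inferred_schema or {})
    return [name for name in operator["fields"] if name not in schema]

# Operator configured against a device-telemetry schema, then published.
filter_op = {
    "fields": ["temperature"],
    "retained_schema": {"deviceId": "String", "temperature": "Float"},
}
# The same operator without a retained schema, as in single-schema inferencing.
legacy_op = {"fields": ["temperature"]}

# On re-entering Edit mode, the preview now happens to contain order events.
newly_inferred = {"orderId": "Integer", "status": "String"}

errors_without_retention = missing_fields(legacy_op, newly_inferred)
errors_with_retention = missing_fields(filter_op, newly_inferred)
errors_when_nothing_inferred = missing_fields(filter_op, None)
```

Without retention the operator reports `temperature` as missing; with the retained schema it validates cleanly, including when nothing is inferred at all.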
The introduction of multiple-schema inferencing in Eventstream addresses the previously mentioned challenges, enabling data integration and processing in environments with complex and varied data structures. To use this feature, you need to enable it in your eventstream.
When configuring the first operator node following the default stream, it is essential to select one of the inferred schemas. This selection enables the transformation path to be designed using data fields (columns) from the selected schema. Within a single Eventstream, distinct transformation paths can utilize different schemas for data processing, enhancing the flexibility and adaptability of data transformation workflows.
Multiple-schema inferencing enables a clearer and more organized view of previewed data and test results. Previously, data containing multiple schemas was displayed with mixed columns, making it difficult to interpret. With this new feature, you can select a specific inferred schema to filter the preview or test results, ensuring that only data matching the selected schema is shown. This makes it much easier to understand your streaming data and confidently design accurate transformation paths.
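As a rough illustration of why a schema filter helps, the Python sketch below compares the column set of an unfiltered preview with a preview restricted to one schema. It assumes, for simplicity, that a schema is identified by its set of field names; the helper names and sample events are hypothetical.

```python
def mixed_columns(events):
    """Columns a single unfiltered table would need, in first-seen order."""
    columns = []
    for event in events:
        for name in event:
            if name not in columns:
                columns.append(name)
    return columns

def filter_by_schema(events, schema_fields):
    """Keep only the events whose field names exactly match the selected schema."""
    wanted = frozenset(schema_fields)
    return [event for event in events if frozenset(event) == wanted]

# Hypothetical preview sample with two event shapes interleaved.
preview = [
    {"deviceId": "a1", "temperature": 21.5},
    {"orderId": 17, "status": "shipped"},
    {"deviceId": "b2", "temperature": 19.0},
]

all_columns = mixed_columns(preview)
device_rows = filter_by_schema(preview, ["deviceId", "temperature"])
```

The unfiltered view needs four columns, half of them empty in any given row, while the filtered view contains only the two device events and their two columns.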
This feature preserves the schema applied in transformation paths (for example, in operators) after the eventstream is published. This eliminates the authoring errors that appeared on transformation paths in single-schema inferencing eventstreams when no matching schema was present in Edit mode. You can continue to adjust operator configurations and publish the eventstream even if the newly inferred schema does not match the one used in operator configurations, or if no schema is inferred when you re-enter Edit mode.
You can review and verify the inferred schemas within an eventstream from multiple interfaces, and correct any field whose data type was inferred incorrectly. For instance, if a field (column) contains only ‘null’ values in all previewed rows, Eventstream cannot determine its data type and defaults to Record. The intended type may actually be Integer, with the field only occasionally containing values. While the type remains Record, you cannot apply a filter condition such as ‘value is greater than xxx’. With this enhancement, you can correct the data type to ‘Integer’ and then configure such filters.
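The null-only case can be modeled in a few lines of Python. This is an illustrative sketch, not Eventstream's inference logic: only the "all nulls default to Record" rule comes from the description above, while the remaining type rules, the helper names, and the idea that a greater-than filter requires an Integer field are simplifying assumptions.

```python
def infer_field_type(values):
    """Infer a coarse type label from the previewed values of one field."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "Record"  # all-null field: default described in the text above
    if all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        return "Integer"
    return "String"  # simplification: everything else is treated as String

def infer_schema(rows):
    """Collect the values of each field across all rows and infer its type."""
    values_by_field = {}
    for row in rows:
        for name, value in row.items():
            values_by_field.setdefault(name, []).append(value)
    return {name: infer_field_type(vals) for name, vals in values_by_field.items()}

def can_filter_greater_than(schema, field):
    """Assumed rule: a numeric comparison needs a numeric field type."""
    return schema.get(field) == "Integer"

# Hypothetical preview in which 'value' happens to be null in every row.
rows = [
    {"sensor": "s1", "value": None},
    {"sensor": "s2", "value": None},
]

inferred = infer_schema(rows)
corrected = {**inferred, "value": "Integer"}  # manual correction by the author
```

With the inferred schema the comparison filter is rejected, and after the manual correction it becomes configurable.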
In conclusion, the introduction of multiple-schema inferencing in Eventstream marks a significant advancement in data transformation flexibility. By enabling the inference and utilization of multiple schemas, you can now design diverse transformation paths that cater to complex and varied data structures with greater ease and precision. This enhancement not only addresses the challenges previously faced with single-schema inference but also improves the overall efficiency and accuracy of data processing workflows. As a result, Eventstream users can look forward to a more streamlined and adaptable data integration experience, ultimately driving better outcomes for their data analytics projects. To learn more about this feature, please visit Enhancing events processing with multiple-schema inferencing.
Get started with a free trial of Microsoft Fabric today! If you have any questions, please contact us via email at askeventstreams@microsoft.com. You are also welcome to provide feedback or submit feature requests on Fabric Ideas, and participate in discussions with other users in the Fabric Community.