I have a column in my database containing employee start dates, but they’re stored as text strings in various formats (e.g., “2023-05-15”, “15/05/2023”, “May 15, 2023”). How can I use Power Query (Power BI Transform) to extract these dates into a consistent date format for analysis in Power BI?
Convert Strings to Dates in Power Query
A common best practice is to perform data cleansing as close to the data source as possible. When the source itself cannot be changed, Power Query is the next best place to do it. This tutorial shows how to use Power Query to transform inconsistent date strings into a single date type, enabling accurate time-based analysis in Power BI.
Step-by-Step Instructions:
- Open Power BI Desktop and load your EmployeeRecords table.
- Go to Home > Transform data to open Power Query Editor.
- Select the StartDate column.
- Go to Add Column > Date > Parse to add a new column that parses the text as a date.
- To control the culture explicitly, go to Add Column > Custom Column instead and use the following M code:
= Date.FromText([StartDate], [Culture="en-US"])
- Rename the new column to “StartDateFormatted”.
- If some dates fail to convert, create a custom column with this M code:
= try Date.FromText([StartDate]) otherwise
try Date.FromText([StartDate], [Culture="en-GB"]) otherwise
try Date.FromText([StartDate], [Culture="en-US"]) otherwise null
- Set the data type of the new column to Date, then remove the original StartDate column so that only the formatted date column remains.
- Close and Apply your changes.
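The steps above can also be written as a single query in the Advanced Editor. The following is a minimal sketch, not a definitive implementation: the inline sample table and its Employee column are assumptions standing in for the real EmployeeRecords table, and the step names (AddParsed, RemoveOriginal, Result) are illustrative.

```powerquery
let
    // Assumption: a small inline sample stands in for the EmployeeRecords table.
    Source = #table(
        type table [Employee = text, StartDate = text],
        {
            {"Ana", "2023-05-15"},
            {"Ben", "15/05/2023"},
            {"Cal", "May 15, 2023"},
            {"Dee", "not a date"}
        }
    ),
    // Try the current culture first, then en-GB, then en-US; null if all fail.
    AddParsed = Table.AddColumn(
        Source,
        "StartDateFormatted",
        each
            try Date.FromText([StartDate]) otherwise
            try Date.FromText([StartDate], [Culture = "en-GB"]) otherwise
            try Date.FromText([StartDate], [Culture = "en-US"]) otherwise null,
        type nullable date
    ),
    // Replace the original text column with the typed date column.
    RemoveOriginal = Table.RemoveColumns(AddParsed, {"StartDate"}),
    Result = Table.RenameColumns(RemoveOriginal, {{"StartDateFormatted", "StartDate"}})
in
    Result
```

Passing `type nullable date` as the fourth argument of Table.AddColumn types the new column as a date directly, so no separate Change Type step is needed. Values that no culture can parse, such as "not a date", end up as null rather than as errors.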
Remember:
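- The Power Query Date.FromText() function parses text according to a culture (the current one by default), so it recognizes the common date formats of that culture rather than every possible format.
- Trying several cultures in turn helps handle different regional date formats, but an ambiguous value such as “05/06/2023” is interpreted by the first culture that accepts it.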
- The ‘try-otherwise’ pattern provides fallback options for parsing dates.
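To see why the order of cultures matters, here is a small sketch that parses the same ambiguous string under two cultures. It assumes a Power Query version that accepts an options record as the second argument of Date.FromText; the step names are illustrative.

```powerquery
let
    Ambiguous = "05/06/2023",
    // Month-first under en-US, day-first under en-GB.
    AsUS = Date.FromText(Ambiguous, [Culture = "en-US"]),   // #date(2023, 5, 6)
    AsGB = Date.FromText(Ambiguous, [Culture = "en-GB"]),   // #date(2023, 6, 5)
    // An explicit Format removes the guesswork when the layout is known.
    Explicit = Date.FromText("15/05/2023", [Format = "dd/MM/yyyy", Culture = "en-GB"])
in
    [US = AsUS, GB = AsGB, Explicit = Explicit]
```

If the source system is known to write day-first dates, putting the matching culture (or an explicit Format) first in the fallback chain avoids silently swapping day and month.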
Troubleshooting:
If dates are still not parsing correctly:
- Check for leading or trailing spaces in the original text and remove them with Text.Trim() (Transform > Format > Trim).
- Look for inconsistent separators (e.g., both “-” and “/”) and standardize them with Text.Replace() or Replace Values.
- For formats that still fail, create a custom function using Text.Split() and #date().
- If a significant number of dates fail to parse, enable Column quality and Column distribution on the View tab to identify patterns in the problematic data.
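The first three troubleshooting ideas can be combined into one custom function. The sketch below is only one possible shape: ParseDayFirst is a hypothetical helper name, and it assumes the problematic values are day-first dates separated by “-” or “/”.

```powerquery
let
    // Hypothetical helper: parses day-first text such as " 15-05-2023 " or "15/05/2023".
    ParseDayFirst = (raw as nullable text) as nullable date =>
        if raw = null then null
        else
            let
                // Remove leading/trailing whitespace.
                Trimmed = Text.Trim(raw),
                // Standardize separators so a single split works.
                Normalized = Text.Replace(Trimmed, "-", "/"),
                Parts = Text.Split(Normalized, "/"),
                Result =
                    if List.Count(Parts) = 3 then
                        // Build the date from year, month, day; null if any part is invalid.
                        try #date(
                            Number.FromText(Parts{2}),
                            Number.FromText(Parts{1}),
                            Number.FromText(Parts{0})
                        ) otherwise null
                    else null
            in
                Result,
    Example = ParseDayFirst(" 15-05-2023 ")   // #date(2023, 5, 15)
in
    Example
```

Such a function could be used as one more fallback in a custom column, for example `each ParseDayFirst([StartDate])`. Because it assumes day-first order, a year-first value like "2023-05-15" returns null here and should be handled by Date.FromText instead.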