I have the following values in a column. We assume that we know what the categories are, and there are many products with different names. The order is strict and there is an id column. The only things we know are the names of the categories and that the products that follow each category are part of that category.
|Column |ID|
|------------------------------|--|
|Category-Phones |1 |
|Apple iPhone 15 Pro Max 256GB |2 |
|Samsung Galaxy S23 Ultra 256GB|3 |
|Motorola Razr+ 256GB |4 |
|Category-Wearables |5 |
|Google Pixel Watch |6 |
|Fitbit Sense 2 |7 |
|Garmin Venu 3 with Silicone B.|8 |
|Category-Phone Accessories |9 |
|Apple MagSafe Battery Pack |10|
|Tile Mate Essential (2022) 4P.|11|
|Apple Clear Case iPhone 11Pro |12|
and I would like to get the data in two columns:
|Categories |Products |
|----------------------------------|---------------------------------|
|Category-Phones |Apple iPhone 15 Pro Max 256GB |
|Category-Phones |Samsung Galaxy S23 Ultra 256GB |
|Category-Phones |Motorola Razr+ 256GB |
|Category-Wearables |Google Pixel Watch |
|Category-Wearables |Fitbit Sense 2 |
|Category-Wearables |Garmin Venu 3 with Silicone Band |
|Category-Mobile Phone Accessories |Apple MagSafe Battery Pack |
|Category-Mobile Phone Accessories |Apple Clear Case for iPhone 11Pro|
|Category-Mobile Phone Accessories |Tile Mate Essential (2022) 4-Pack|
How could I do this in SQL Server?
2条答案
按热度按时间vnjpjtjt1#
This is still indicative of a serious schema problem. Hopefully you intend to use the solution to resolve that issue?
Assuming you have the ID column, and it's always true that the rows that do not start with 'Category-' are products that belong to the preceding category:
So here we find the category rows (1, 5 and 9) (in the
CTE
) and then join back to the table to get all rows that are greater than the category ID and less than the NEXT category ID.This will only work when the conditions are true, and is BEGGING to have a proper relationship defined for it. Hopefully you are just trying to sanitize the data to achieve that.
| CategoryID | ProductID | CategoryName | ProductName |
| ------------ | ------------ | ------------ | ------------ |
| 1 | 2 | Category-Phones | Apple iPhone 15 Pro Max 256GB |
| 1 | 3 | Category-Phones | Samsung Galaxy S23 Ultra 256GB |
| 1 | 4 | Category-Phones | Motorola Razr+ 256GB |
| 5 | 6 | Category-Wearables | Google Pixel Watch |
| 5 | 7 | Category-Wearables | Fitbit Sense 2 |
| 5 | 8 | Category-Wearables | Garmin Venu 3 with Silicone B. |
| 9 | 10 | Category-Phone Accessories | Apple MagSafe Battery Pack |
| 9 | 11 | Category-Phone Accessories | Tile Mate Essential (2022) 4P. |
| 9 | 12 | Category-Phone Accessories | Apple Clear Case iPhone 11Pro |
pgccezyw2#
First, data in a SQL Server table has no implied order, so you need to unambiguously define one. I have added an IDENTITY value to your data for this purpose.
From there, selecting the products is straight forward using a
LIKE
operator. As for choosing the associated Category, you can use theOUTER APPLY (SELECT TOP 1 ... ORDER BY ...)
pattern to select the best matching row - in this case the closest preceding "Category" row.If a product appears before the first category, the above would return a null CategoryName.
Results:
| ProductName | CategoryName |
| ------------ | ------------ |
| Product1 | Category1 |
| product2 | Category1 |
| Product3 | Category1 |
| Product4 | Category2 |
| Product5 | Category2 |
| Product6 | Category2 |
| Product7 | Category3 |
| Product8 | Category3 |
| Product9 | Category3 |
See this db<>query