Add Melt method #7577

@sevenzees

Is your feature request related to a problem? Please describe.

Working with DataFrames in wide format is often inconvenient for analysis and visualization. I'm frequently frustrated when I need to transform data from wide format (multiple columns representing different variables) to long format (rows representing observations). Currently, there's no built-in way to "unpivot" or "melt" a DataFrame in the .NET DataFrame API, which means I have to write complex manual loops to restructure my data. This is a common operation in data analysis workflows, especially when preparing data for charting libraries or performing grouped aggregations.

Describe the solution you'd like

I'd like a Melt() method similar to Pandas' pandas.melt() function (https://pandas.pydata.org/docs/reference/api/pandas.melt.html) that transforms a DataFrame from wide format to long format. The method should:

  • Accept identifier columns (idColumns) that remain as columns in the output
  • Accept value columns (valueColumns) to unpivot, or automatically use all non-ID columns if not specified
  • Allow customization of the variable column name (defaults to "variable") that will contain the original column names
  • Allow customization of the value column name (defaults to "value") that will contain the unpivoted values
  • Support a dropNulls parameter to exclude null or empty values from the result
  • Handle mixed data types across value columns by converting to string when necessary
  • Validate inputs to prevent invalid configurations (overlapping ID/value columns, empty column lists, etc.)
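To make the requested behavior concrete, here is a minimal sketch of the semantics listed above. It does not use the real DataFrame API: the table is a hypothetical stand-in (`Dictionary<string, List<object>>`, column name to column values), and the `MeltSketch` class and its signature are assumptions for illustration only, mirroring the parameter names proposed in this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DataFrame: column name -> values (all columns have the same length).
public static class MeltSketch
{
    public static Dictionary<string, List<object>> Melt(
        Dictionary<string, List<object>> table,
        IReadOnlyList<string> idColumns,
        IReadOnlyList<string> valueColumns = null,
        string variableName = "variable",
        string valueName = "value",
        bool dropNulls = false)
    {
        if (idColumns == null || idColumns.Count == 0)
            throw new ArgumentException("At least one ID column is required.", nameof(idColumns));

        // Default: unpivot every column that is not an ID column.
        var values = valueColumns ?? table.Keys.Where(c => !idColumns.Contains(c)).ToList();
        if (values.Count == 0)
            throw new ArgumentException("There are no value columns to unpivot.", nameof(valueColumns));
        if (values.Intersect(idColumns).Any())
            throw new ArgumentException("ID and value columns must not overlap.");
        if (variableName == valueName || idColumns.Contains(variableName) || idColumns.Contains(valueName))
            throw new ArgumentException("Output column names must be distinct from each other and from the ID columns.");
        foreach (var name in idColumns.Concat(values))
            if (!table.ContainsKey(name))
                throw new ArgumentException($"Column '{name}' does not exist.");

        // If the value columns hold more than one runtime type, fall back to strings.
        bool mixedTypes = values.SelectMany(c => table[c])
                                .Where(v => v != null)
                                .Select(v => v.GetType())
                                .Distinct()
                                .Count() > 1;

        var result = new Dictionary<string, List<object>>();
        foreach (var id in idColumns) result[id] = new List<object>();
        result[variableName] = new List<object>();
        result[valueName] = new List<object>();

        int rowCount = table[idColumns[0]].Count;
        for (int row = 0; row < rowCount; row++)
        {
            // One output row per (input row, value column) pair.
            foreach (var column in values)
            {
                object cell = table[column][row];
                bool isEmpty = cell == null || (cell is string s && s.Length == 0);
                if (dropNulls && isEmpty) continue;

                foreach (var id in idColumns) result[id].Add(table[id][row]);
                result[variableName].Add(column);
                result[valueName].Add(mixedTypes ? cell?.ToString() : cell);
            }
        }
        return result;
    }
}
```

The sketch emits rows in row-major order (all value columns for the first input row, then the second, and so on). pandas.melt orders its output by variable column first, so the ordering is a design decision a real implementation would need to settle. Pass `valueColumns` explicitly when column order matters, since `Dictionary` does not guarantee enumeration order.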

Describe alternatives you've considered

I have written application-level code to do this, but it is such a common use case that I think it makes sense to include it in the DataFrame API, where everyone can use it.

Additional context

This feature would bring the .NET DataFrame API closer to feature parity with popular data analysis libraries like Pandas (Python) and tidyr (R). The melt operation is fundamental for "tidy data" principles and is commonly used for:

  • Preparing data for time series visualization
  • Reshaping survey or experimental data where each column represents a measurement
  • Converting measurement matrices into observation tables
  • Preparing data for statistical modeling that expects long-format inputs

Example use case:

// Original wide format
// | ID | Name  | Q1_Sales | Q2_Sales | Q3_Sales | Q4_Sales |
// |----|-------|----------|----------|----------|----------|
// | 1  | North | 1000     | 1200     | 1100     | 1300     |
// | 2  | South | 800      | 900      | 950      | 1000     |

var melted = df.Melt(
    idColumns: new[] { "ID", "Name" },
    valueColumns: new[] { "Q1_Sales", "Q2_Sales", "Q3_Sales", "Q4_Sales" },
    variableName: "Quarter",
    valueName: "Sales"
);

// Result: long format suitable for charting
// | ID | Name  | Quarter   | Sales |
// |----|-------|-----------|-------|
// | 1  | North | Q1_Sales  | 1000  |
// | 1  | North | Q2_Sales  | 1200  |
// | 1  | North | Q3_Sales  | 1100  |
// | 1  | North | Q4_Sales  | 1300  |
// | 2  | South | Q1_Sales  | 800   |
// | 2  | South | Q2_Sales  | 900   |
// | 2  | South | Q3_Sales  | 950   |
// | 2  | South | Q4_Sales  | 1000  |

This would significantly improve the DataFrame API's usability for data transformation workflows.

Labels: enhancement (New feature or request), untriaged (New issue has not been triaged)
