My goal was to group a list of financial transactions by year and month based on their booking date and display them later with a charts library in a Blazor application to get insights into my personal spending. The entries are stored using Entity Framework Core (EF Core). This is the first time I tried to group entries with LINQ in EF Core using GroupBy.
I started with using the GroupBy overload with a single key selection function. The official documentation of this extension method was confusing to me due to the high amount of different overloads and I had trouble to figure out how to use the result. JetBrains Rider actually displayed the correct overload in the autocomplete dialog and helped with understanding this. After that, I took a bit of a deeper dive to learn it a bit more.
And as this is my first blog entry, I just want to see how I like writing in this form. Perhaps, these notes will even help me in the future 🙂 To focus on the grouping operation, I am using a simpler example in the form of books stored in a SQLite database without any graphs instead of financial transactions.
To follow along, you need a console
project with the following dependencies installed.
- Microsoft.EntityFrameworkCore (version 7.0.13)
- Microsoft.EntityFrameworkCore.Sqlite (version 7.0.13)
- Microsoft.Extensions.DependencyInjection (version 7.0.0)
- Microsoft.EntityFrameworkCore.Design (version 7.0.13)
Additionally, nullable reference types are enabled in the project:
GroupBy on IQueryable<T>
Let’s start off with a Book model. It contains some basic information like the title, author name and publishing date. The goal is to group the published books by year and month on the publishing date and count the amount of books per group. Additionally, the overall example consists only of the associated DbContext and some basic plumbing. Please note, that this is simplified code with a focus on the GroupBy operation and may not be following best-practices.
This model needs an associated DbContext
. For convenience, I configure the database directly in the OnConfiguring method. This will create a test.db
file in the working directory of the running program. Depending on your IDE or editor, this may be the build output directory or the solution's directory.
Book
and BookContext
definitions must follow after all statements, see: https://learn.microsoft.com/en-us/dotnet/csharp/fundamentals/program-structure/top-level-statements#namespaces-and-type-definitions. Further code snippets must be added before the class definitions above.The above context is configured in the DI provider and instantiated. As a DbContext is a scoped service, a scope must be created and used to instantiate a BookContext instance.
After defining the model and context, the migrations have to be created with dotnet ef migrations add InitialCreate
. In our case, it's just a single migration creating the SQLite table for our Book
model. What migrations are, how they are created, and how they are applied, can be found in the official documentation of EF Core. You may need to install the Entity Framework Core tools if you haven't before.
No project was found. Change the current working directory or use the --project option.
Now, we can add some example books:
Now, we can query the books we created. One example would be to determine how many books have been published in a specific month.
We start off by grouping over the books via the Books
attribute which implements IQueryable<Book>
of the bookContext
:
One of the overloads allows us to specify a key selector, i.e., a method which – given an entity – returns a key which should be used to perform the grouping operation.
Another key selection function could also extract the year in addition to the month. This will be useful if your data spans multiple years. To combine multiple attributes into a single key, you can use anonymous types:
new { book.PublishDate.Year, book.PublishDate.Month }
is a shorthand for new { Year = book.PublishDate.Year, Month = book.PublishDate.Month }
.As anonymous types in C# implement Equals
and HashCode
in terms of their properties, the grouping operation has a way to compare different keys.
How can we now use the result?
For me it only clicked, after I understood, that the result of a GroupBy
invocation with a single key selection function is an IQueryable
whose elements each contain the grouped elements and the associated key. Or, to quote the official documentation I missed at first:
AnIQueryable<IGrouping<TKey, TSource>>
in C# orIQueryable(Of IGrouping(Of TKey, TSource))
in Visual Basic where eachIGrouping<out TKey,out TElement>
object contains a sequence of objects and a key.
Let's unpack that by continuing our example.
As the result of GroupBy
is still an IQueryable
, we can use LINQ operations on the result produced by the grouping operation. For example, we can follow up with a simple Select
operation.
Each element this Select
now operates on is a group produced by the GroupBy
operation. As seen in the official documentation, each element is of type IGrouping<TKey, TSource>
.
TKey
is the return type of our key selector function which in our case is the anonymous type with Year
and Month
attributes.
TSource
is the type of elements coming into the grouping operation. In our case TSource
is Book
, as we are operating directly on the Books
attribute of our BookContext
.
Then, the important thing is that this same IGrouping
also implements IEnumerable
and, thus, can be used get all elements of that group – for example by using a basic ToList
.
We can also use other extensions methods of IEnumerable
like Count
to reach our goal of finding out the amount of published books in a given year and month:
This produces the expected result, as we have two books published in February and one book published in March in our example data.
Where to go from here?
GroupBy
also has other overloads with different return types. For example, there is a variant which accepts a result selector in addition to the key selector. This would shorten our solution above by combining both the GroupBy
and Select
operations:
Other overloads also accept an element selector which allows changing the type which goes into the groups in the end as well as a custom comparison function for keys.
ToArray
, ToList
, or foreach
without additional filtering of values from the database in production as they will load the full table into memory. Approaches could be implementing paging or further grouping the result.Additionally, as grouping is an expensive operation, you should try to limit the amount of elements going into the
GroupBy
operation as far as the question you want to answer allows it. In our example, we could limit the results to a specific year with a Where
operation beforehand.If you see that you don't need all the data going into the grouping operation, you can also try using a
Select
which extracts the data before the group operation or the associated GroupBy
overload.It always depends on your specific use-case and data, though.
The final source code of the example can be found at the end of this article.
GroupBy on IEnumerable<T>
GroupBy
extension methods are also implemented for the IEnumerable<T>
type. It can be used like the IQueryable<T>
implementation.
Using a simplified example, we can create an array (which also implements IEnumerable
):
See also
I encountered grouping operations in other languages as well, here's just a collection of useful links:
- Grouping in MongoDB aggregation pipelines: https://www.mongodb.com/docs/upcoming/reference/operator/aggregation/group/
- Rust's
itertools
: https://crates.io/crates/itertools
For my use-case, I also used the Blazor-ApexCharts library (https://www.nuget.org/packages/Blazor-ApexCharts).
Full source code of the example
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
// As we don't need a web server or similar
// in this examle, just construct a service
// provider without anything else.
var services = new ServiceCollection();
services.AddDbContext<BookContext>();
var provider = services.BuildServiceProvider();
// As mentioned, a scope is needed as `AddDbContext`
// registers the context as a scoped service.
using var scope = provider.CreateScope();
var bookContext = scope.ServiceProvider.GetService<BookContext>()!;
// Re-initialize the database on each start.
bookContext.Database.EnsureDeleted();
bookContext.Database.Migrate();
bookContext.Books.AddRange(
new Book
{
Title = "Book A",
AuthorName = "Author A",
PublishDate = new DateOnly(2023, 2, 1),
},
new Book
{
Title = "Book B",
AuthorName = "Author A",
PublishDate = new DateOnly(2023, 2, 1),
},
new Book
{
Title = "Book C",
AuthorName = "Author B",
PublishDate = new DateOnly(2023, 3, 1),
}
);
bookContext.SaveChanges();
var groupsOneOperation = bookContext.Books
.GroupBy(
book => new { book.PublishDate.Year, book.PublishDate.Month },
(key, books) => new
{
key.Year,
key.Month,
Books = books.Count()
}
);
foreach (var group in groupsOneOperation)
{
Console.WriteLine(group);
}
// { Year = 2023, Month = 2, Books = 2 }
// { Year = 2023, Month = 3, Books = 1 }
class Book
{
public int Id { get; set; }
public required string Title { get; set; }
public required string AuthorName { get; set; }
public required DateOnly PublishDate { get; set; }
}
class BookContext : DbContext
{
// This attribute is set by EF Core.
public DbSet<Book> Books { get; set; } = default!;
// Usually, one would use IOptions<T> or similar
// constructs to obtain the connection options.
protected override void OnConfiguring(DbContextOptionsBuilder options)
=> options.UseSqlite("Data Source=test.db");
}
GroupBy on IQueryable<T> in LINQ
This is a short introduction into GroupBy in LINQ. Two of the overloads are introduced on an example of using books stored in SQLite via Entity Framework Core.