Adding Multinomial and Nested Logit Models for Consumer Choice #1654

NathanielF · 2025-04-27T16:06:26Z

Description

I'm adding two new model classes for discrete choice models that I intend to be part of the consumer choice module.
I'm opening the PR as a draft to discuss the implementation choices and API design I have for these models.

Related to this issue: #1653. Discrete choice models have a lot of potential for Bayesian modelling in particular. The state-of-the-art models in this domain involve a mixed logit parameterisation, for which "vanilla" implementations are pretty straightforward using Bayesian hierarchical parameterisations.

Two New Models

Main things to flag: there are now two new model files in the consumer choice folder, the simple Multinomial Logit and the Nested Logit. As outlined in the issue, I've restricted the nested logit to no more than two layers of nesting. I believe this brings us beyond parity with packages like mlogit in R and pylogit in Python, which allow nesting structures only one level deep.

API Discussion

The API I'm suggesting for these models differs from the typical X, y inputs of the models in pymc-marketing in general, mostly because I feel the use of Wilkinson-style notation is important here. For instance, this is how you specify the Nested Logit Model currently:
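As a rough illustration of what Wilkinson-style notation can look like for choice models, the sketch below parses one utility equation per alternative. The `target ~ a + b | c` layout, with `|` separating alternative-specific covariates from individual-specific ones, is an assumption borrowed from the formula convention in R's mlogit. `parse_utility_formula` and the column names are hypothetical and are not the API proposed in this PR.

```python
def _split_terms(part):
    """Split 'a + b' into ['a', 'b'], ignoring empty pieces."""
    return [term.strip() for term in part.split("+") if term.strip()]


def parse_utility_formula(formula):
    """Split 'target ~ a + b | c' into target, alternative-specific
    covariates, and individual-specific (fixed) covariates.

    Hypothetical helper for illustration only.
    """
    target, sep, rhs = formula.partition("~")
    if not sep:
        raise ValueError(f"missing '~' in formula: {formula!r}")
    # Everything after '|' (if present) is treated as fixed covariates.
    alt_part, _, fixed_part = rhs.partition("|")
    return {
        "target": target.strip(),
        "alt_covariates": _split_terms(alt_part),
        "fixed_covariates": _split_terms(fixed_part),
    }


# One utility equation per alternative (illustrative column names).
formulas = [
    "bus ~ price_bus + time_bus | income",
    "car ~ price_car + time_car | income",
]
parsed = [parse_utility_formula(f) for f in formulas]
```

Keeping one equation per alternative makes it explicit which columns drive which product's utility, which the usual X, y interface cannot express.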

We assume a wide-data input as well:
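To make "wide" concrete: in a wide layout each row is one decision-maker, and each alternative-specific attribute gets one column per alternative. The following is a minimal sketch, not the dataset used in this PR. The column names, the `<variable>_<alternative>` naming scheme, and the `wide_to_long` helper are all assumptions made for illustration.

```python
# Hypothetical wide-format data: one row per decision-maker.
wide_rows = [
    {"id": 1, "choice": "bus", "price_bus": 2.0, "price_car": 6.0, "income": 30},
    {"id": 2, "choice": "car", "price_bus": 2.5, "price_car": 5.0, "income": 55},
]
alternatives = ["bus", "car"]


def wide_to_long(rows, alternatives, alt_vars, fixed_vars):
    """Expand each wide row into one record per alternative."""
    long_rows = []
    for row in rows:
        for alt in alternatives:
            record = {"id": row["id"], "alt": alt, "chosen": row["choice"] == alt}
            # Alternative-specific values live in '<var>_<alt>' columns.
            for var in alt_vars:
                record[var] = row[f"{var}_{alt}"]
            # Individual-specific values are copied to every alternative.
            for var in fixed_vars:
                record[var] = row[var]
            long_rows.append(record)
    return long_rows


long_rows = wide_to_long(wide_rows, alternatives, ["price"], ["income"])
```

The long form is shown only for contrast: it repeats the individual-level columns for every alternative, which the wide form avoids.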

Causal Inference and Counterfactuals

The value that these models bring is their focus on causal inference. The history of discrete choice models stems, in effect, from the observation that multinomial logit models cannot support plausible counterfactuals around market interventions because of the independence of irrelevant alternatives (IIA) property; more sophisticated discrete choice models, such as the nested logit, are able to address this. See, for instance, how a pricing intervention on a multinomial logit results in proportional re-allocation of market share to the rest of the market.
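The proportional re-allocation can be checked numerically with a plain softmax over utilities. This is a toy sketch with made-up utility values, not output from the models in this PR; `mnl_shares` is a hypothetical helper.

```python
import math


def mnl_shares(utilities):
    """Multinomial logit market shares: a softmax over utilities."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exps = {alt: math.exp(u - m) for alt, u in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}


# Illustrative utilities for three products (assumed values).
base = {"bus_red": 1.0, "bus_blue": 1.0, "car": 1.5}
# A price increase on bus_red is modelled as a drop in its utility.
shock = dict(base, bus_red=base["bus_red"] - 1.0)

before = mnl_shares(base)
after = mnl_shares(shock)

# IIA: the share ratio between two untouched alternatives does not move,
# so the lost share is re-allocated in proportion to existing shares.
ratio_before = before["car"] / before["bus_blue"]
ratio_after = after["car"] / after["bus_blue"]
```

Here `ratio_before` and `ratio_after` agree up to floating-point error, even though `bus_blue` is arguably a much closer substitute for `bus_red` than `car` is.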

We demonstrate this problem and its solution by adding two new notebooks to the gallery.

In the second notebook, for the nested logit, we show how the IIA problem is solved by this extra nesting structure:
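A toy calculation shows the mechanism. The sketch below uses the textbook two-level nested logit, where the probability of an alternative is the probability of its nest times the probability of the alternative within the nest, with one scale parameter per nest. This may not match the parameterisation used in this PR. The nests, utilities, scale values, and the `nested_logit_shares` helper are assumptions for illustration.

```python
import math


def nested_logit_shares(utilities, nests, lambdas):
    """Two-level nested logit shares.

    utilities: alternative -> utility
    nests:     nest name -> list of alternatives in that nest
    lambdas:   nest name -> scale parameter in (0, 1]
    """
    # Inclusive value of each nest: log-sum-exp of the scaled utilities.
    inclusive = {
        nest: math.log(sum(math.exp(utilities[j] / lambdas[nest]) for j in members))
        for nest, members in nests.items()
    }
    denom = sum(math.exp(lambdas[n] * inclusive[n]) for n in nests)
    shares = {}
    for nest, members in nests.items():
        p_nest = math.exp(lambdas[nest] * inclusive[nest]) / denom
        for j in members:
            p_within = math.exp(utilities[j] / lambdas[nest] - inclusive[nest])
            shares[j] = p_nest * p_within
    return shares


nests = {"bus": ["bus_red", "bus_blue"], "private": ["car"]}
lambdas = {"bus": 0.5, "private": 1.0}  # assumed values
base = {"bus_red": 1.0, "bus_blue": 1.0, "car": 1.5}
shock = dict(base, bus_red=base["bus_red"] - 1.0)  # price rise on bus_red

before = nested_logit_shares(base, nests, lambdas)
after = nested_logit_shares(shock, nests, lambdas)

# Unlike the multinomial logit, the car / bus_blue ratio now changes:
# bus_blue absorbs a disproportionate part of bus_red's lost share.
ratio_before = before["car"] / before["bus_blue"]
ratio_after = after["car"] / after["bus_blue"]
```

Setting every scale parameter to 1.0 collapses this to the ordinary multinomial logit, which is a convenient sanity check on the nesting code.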

Fixed Attributes and Alternative Specific Attributes

One thing I've done is ensure that the models can identify parameters for both the alternative-specific attributes (e.g. price) and the individually fixed attributes (e.g. income). I've done my best to benchmark the parameter identification and recovery against R's mlogit package:

How to Proceed?

I have not done an extensive write-up of the math behind these types of models, and some of the functions need more documentation and tests. But I wanted to share what I have so far to generate discussion and decide how to proceed. One immediate improvement I can think of would be to remove duplication between the nested logit and multinomial logit model classes by making them subclasses of a more general "DISCRETE CHOICE" class, where we could re-use e.g. the formula parsing functions. Additionally, I'd like to benchmark the parameter identification with a second data set and example.

Longer term, I think there is room for adding a vanilla mixed-logit example too.

Anyway, open to feedback. Adding a draft PR now to check which linting and testing failures I have.

Related Issue

Closes #
Related to Adding Discrete Choice Models to Consumer Choice Module #1653

Checklist

Checked that the pre-commit linting/style checks pass. Feel free to comment pre-commit.ci autofix to auto-fix.
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks) using numpydoc format.
If you are a pro: each commit corresponds to a relevant logical change

📚 Documentation preview 📚: https://pymc-marketing--1654.org.readthedocs.build/en/1654/


review-notebook-app · 2025-04-27T16:06:31Z

Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

codecov · 2025-04-27T16:09:10Z

Codecov Report

Attention: Patch coverage is 93.56618% with 35 lines in your changes missing coverage. Please review.

Project coverage is 93.40%. Comparing base (d41e74e) to head (f96a856).
Report is 6 commits behind head on main.

Files with missing lines	Patch %	Lines
pymc_marketing/customer_choice/nested_logit.py	92.72%	24 Missing ⚠️
pymc_marketing/customer_choice/mnl_logit.py	94.85%	11 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1654      +/-   ##
==========================================
+ Coverage   93.39%   93.40%   +0.01%     
==========================================
  Files          56       58       +2     
  Lines        6329     6873     +544     
==========================================
+ Hits         5911     6420     +509     
- Misses        418      453      +35


williambdean · 2025-04-29T04:04:06Z

Can this be done with bambi?

NathanielF · 2025-04-29T10:30:28Z

I believe some aspects of the multinomial logit can, but not the separate utility equations with fixed covariates, and the nested logit cannot be done with Bambi.

NathanielF · 2025-04-29T16:56:41Z

Maybe to make that a little clearer, @williambdean: the multinomial logit in this discrete choice implementation is related to the more standard multinomial regression you will find in Bambi, but it differs in an important way that is rooted in the utility theory behind the modelling enterprise. The model is conceptualised as involving "drivers" of the utility for each of the products on a market, so within the model we have N linear models, each representing the utility of one good and each taking attributes of that product alternative as features (rather than shared attributes). By allowing these distinct "alternative specific" and "individual specific" covariates, we attempt to model a choice scenario, and the covariates have a specific interpretation under that scenario.

Standard multinomial regression models don't make that distinction and so can't be interpreted in the same way.

Do you think it would help the PR if I put more of this background in the notebooks?
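The distinction can be sketched with one utility equation per alternative. In the toy example below, price is alternative-specific and gets a single coefficient shared across alternatives, while income is individual-specific and gets one coefficient per alternative, with a reference alternative pinned at zero for identification. All names and coefficient values are made up for illustration and are not taken from the PR.

```python
import math

alternatives = ["bus", "car", "train"]

# Assumed parameter values, for illustration only.
intercepts = {"bus": 0.0, "car": 0.8, "train": 0.3}
beta_price = -0.4  # alternative-specific covariate: one shared coefficient
# Individual-specific covariate: one coefficient per alternative,
# with "bus" as the reference alternative fixed at 0.
gamma_income = {"bus": 0.0, "car": 0.02, "train": 0.01}


def choice_probabilities(prices, income):
    """Softmax over per-alternative utilities U_j = a_j + b*price_j + g_j*income."""
    utilities = {
        alt: intercepts[alt] + beta_price * prices[alt] + gamma_income[alt] * income
        for alt in alternatives
    }
    m = max(utilities.values())  # subtract the max for numerical stability
    exps = {alt: math.exp(u - m) for alt, u in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}


prices = {"bus": 2.0, "car": 6.0, "train": 4.0}
probs = choice_probabilities(prices, income=40)

# A counterfactual: raise the car price and recompute the choice probabilities.
probs_shock = choice_probabilities(dict(prices, car=8.0), income=40)
```

Because price enters each utility through the alternative's own column, a question like "what if the car price rises?" maps onto a change in a single input, which is what gives the covariates their interpretation under the choice scenario.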

NathanielF added 16 commits April 18, 2025 22:37

ede6186 working on parameter identification multinomial logit and formula interface
262cc74 allowing for the incorporation of fixed covariates
1a6a3ac tidying formula parser
e59de36 adding intervention functionality and plotting
374e698 add skeleton mnl notebook to gallery
0397226 working on the nested logit
3212cd3 updating nested logit notebook
2ae4b0f fixed nested logit with fixed covariates
57f0159 generalising pre-processing for 3 level nesting
978b6ef identified three level nesting
2eb4d60 working three level model
82912b2 defining w_nest within each nest
34b8308 working 2 and 3 level nesting
1c096c4 update gallery
ad00eb5 Adding some tests for nested logit
2e41156 tidying notebook

Each commit above carries the trailer Signed-off-by: Nathaniel <[email protected]>

github-actions bot added labels Apr 27, 2025: docs (Improvements or additions to documentation), tests, customer choice (Related to customer choice module)

NathanielF added 9 commits April 27, 2025 19:02

b222e1e fix majority of linting errors and update tests
025626b fix linting and formatting
21c0521 run ruff on test files
7b352fb run ruff on notebooks
5bf96fb run ruff format
e2b93ab improve test coverage
6db033b fix key error
d343faf update multinomial notebook and test coverage
7d9fddb format test
41747ef update nested logit sample test

Each commit above carries the trailer Signed-off-by: Nathaniel <[email protected]>

NathanielF self-assigned this Apr 28, 2025

NathanielF added 12 commits April 28, 2025 13:53

5763a1d allowing no fixed covariates
672843b updating nested logit notebook
dec47e5 running ruff format
24a65a0 ruff check and fix
cc0ce40 run UML and add plot test
b79a24e run pre-commit checks
5100ce0 resolving uml conflict with main
0a97a3c adding more test coverage
200f85c adding fit kwargs in nested logit notebook
0125115 remove compare code from notebook
6b1dd4e remove compare code from notebook
4e4d6c5 update test coverage

Each commit above carries the trailer Signed-off-by: Nathaniel <[email protected]>

NathanielF marked this pull request as ready for review April 28, 2025 22:04

Merge branch 'main' into discrete_choice_module

f96a856

drbenvincent self-requested a review May 2, 2025 12:32

NathanielF mentioned this pull request May 9, 2025: API change for the SyntheticControl experiment class pymc-labs/CausalPy#460 (Merged, 3 tasks)
