Abstract

Developing responsive dynamic marketing strategies can be challenging in the absence of complete customer information, such as share of wallet, limiting the ability of the firm to target promotions and other marketing efforts with a view to optimizing customer lifetime value (CLV). Furthermore, much of the existing research on CLV treats customers as receivers, rather than co-creators, of services. We address these two key challenges by developing a reinforcement learning (RL)-based promotion optimization model to determine which promotion strategies are most suitable for targeting different customer groups. Specifically, using feedback derived from customers’ real-time transactional responses to promotional campaigns, we present an RL algorithm that (a) continually refines the estimated effectiveness of promotions, aligning them to customers’ preferences to maximize their CLV, and that (b) supports value co-creation by involving customers as active participants to enhance their service experience. We demonstrate the effectiveness of the model through simulation scenarios within the context of a ferry travel agency, providing evidence of its real-world potential.

Keywords

customer lifetime value, resource allocation, optimization, learning algorithm, value co-creation

Introduction

Promotional campaigns are an important component of marketing strategies, and by targeting specific customer segments with tailored promotions (e.g., discounts, coupons, loyalty programs), businesses can optimize the effectiveness of their marketing (von Mutius and Huchzermeier 2021). However, measuring the long-term effectiveness of firms’ marketing activities (e.g., promotional campaigns) and developing responsive dynamic marketing strategies can be challenging, especially without complete information on customers’ buying behavior (e.g., on the company’s share of customers’ wallets). Many firms, such as the airline Emirates, have access to a rich body of information on their customers’ purchasing behaviors, including the recency and frequency of purchases, and the user journey on the company website and mobile app (e.g., Line et al. 2020); however, obtaining information on customers’ transactions with competing firms, such as United Airlines, is difficult. This limitation forces most companies to rely on an inward view when managing customer relationships, making it challenging to target promotions and other marketing efforts to achieve the best return on investment (Du, Kamakura, and Mela 2007). For example, when resources are limited, firms may choose to target highly profitable customers with promotions (Reinartz and Kumar 2000); yet, if some of those customers already spend their entire wallet with the firm, promotional campaigns such as discounts may have no effect on increasing their expenditure, and revenue is lost on their customary sales. Thus, it is telling that the literature notes that firms tend to overspend on retaining high-value customers and underspend on lower-value ones (Ovchinnikov, Boulu-Reshef, and Pfeifer 2014).

Encouraging a collaborative mechanism between the company and its customers, formally known as the concept of “value co-creation,” can address some of these challenges (Prahalad and Ramaswamy 2004): It can help the company understand its customers’ needs and preferences and align its services accordingly, and marketers are increasingly recognizing co-creation for its important role in marketing strategy (Ad Age Studio 30 2023).

Enhancing individual customers’ lifetime value (CLV) remains a challenge in marketing, especially when key variables such as share of wallet are not directly observable. Studies engaging in and encouraging CLV-oriented research efforts abound in a diverse range of marketing fields, including those focusing on the management of customer equity (Bell et al. 2002; Drèze and Bonfrer 2009; Karvanen, Rantanen, and Luoma 2014; Kim, Boo, and Qu 2018), the quality of the company–client relationship and its recovery (Stakhovych and Tamaddoni 2020), customer relationship management (Berger et al. 2006; Dew et al. 2024; Libai et al. 2020; Salisbury et al. 2023), big data issues (Line et al. 2020; Ram and Zhang 2022), customer segmentation (Kanchanapoom and Chongwatpol 2023), customer engagement (Bijmolt et al. 2010; Manosuthi, Lee, and Han 2021; Meire et al. 2019), loyalty and loyalty programs (Ascarza et al. 2018; Terblanche 2015; Yoo, Bai, and Singh 2020), marketing strategy (McManus 2013), and pricing strategies (Talón-Ballestero, Nieto-García, and Gonzáles-Serrano 2022), among others.

However, much of the existing research on CLV still treats customers as receivers, rather than co-creators, of services. One of the reasons for this gap may lie in how firms approach the issue of customer knowledge, a key factor underpinning firms’ engagement with customers (Kumar et al. 2010). In particular, there may be a need to expand the traditional notion of how firms define and apply feedback from customers to add value to the firm, which Kumar et al. (2010, p. 299) call “customer knowledge value (CKV).” The classic approach to CKV focuses on engaging with consumers directly and in their networks (e.g., in brand communities) in order to generate what might be termed explicit feedback, such as user-submitted ideas for innovations, customer survey data, and customers’ online complaint and product review activity (Kumar et al. 2010). These applications of CKV may miss the opportunity to capture implicit feedback: ongoing behavior-driven signals that reflect how customers dynamically respond to promotional offers, product experiences, or service interactions.

We propose that CKV should also include implicit feedback derived from real-time transactional responses (see Figure 1)—for example, how customers react to promotional campaigns such as redemptions, and the effectiveness of different promotions with different customer groups. Unlike traditional transactional behavior, real-time interactions go beyond the monetary value of sales and focus on the knowledge generated from customer responses when interpreted as a signal of preference or motivation. This expansion of CKV requires a methodology aimed at capturing, interpreting, and acting on dynamic responses, as traditional analytical models tend to be static in nature, do not learn from ongoing behavioral patterns, and fail to update decisions in real time.

Figure 1.

Adaptive promotion strategy via CKV and value co-creation framework.

To address this, and as represented in Figure 1, we develop a reinforcement learning (RL) algorithm along with an optimization model that continuously updates promotions based on knowledge of customer behavior and needs—a form of value co-creation that enhances the service experience for the customers while maximizing CLV for the firm. RL is an ideal methodology when functional relationships (e.g., the effect of different promotions on the number of transactions for different customer groups) are unknown, and important variables (e.g., share and size of wallet) are not easily observable (Sutton and Barto 1998).

The RL algorithm uses real-time customer behavioral responses to applied promotions to update the estimates of the effectiveness of various promotions on different customer groups. This aligns with the concept of value emerging through interactions, and the updated customer response parameters resulting from this are, in turn, used to optimize the marketing strategies to maximize CLV. RL is used in interactive, digital service environments like chatbots and virtual assistants to tailor responses to customer preferences (e.g., they ask a customer to choose between two of their responses). This information is used to tailor the style of their future responses to that customer (i.e., to refine interaction based on user feedback). These systems work in a way that aligns with the principles of CKV, although they have not been framed as such. In the current example, this link is formalized such that the structured optimization model considers business constraints, such as limited resources, ensuring that promotional strategies are in line with customers’ preferences and are also both feasible and cost-effective. The model also encourages co-creation in the optimization by adding customers’ feedback into the equation. Figure 1 summarizes the problem and the approach to the solution.

Our study extends work on CLV by including customers as active participants in shaping future promotional strategies (i.e., co-creators of value) in settings where customer information is incomplete, such as in noncontractual settings where there is limited knowledge about customer behavior (e.g., their share of wallet). The work responds to calls for dynamic planning of multiple promotional campaigns (e.g., Ascarza et al. 2018), for the integration of machine learning models in marketing studies (e.g., Ngai and Wu 2022), and for work that links variables that are not easily observed, such as share and size of wallet, with future-oriented metrics such as CLV for resource allocation problems (Petersen et al. 2009). Thus, we advance service marketing theory by demonstrating how service strategies, and CKV in particular, can be co-created through feedback loops that are both dynamic and data-driven, and by treating customers as value co-creators. We also advance service practice by presenting a methodology for implementing the co-creation of value, illustrating the application of this approach using data collected from a ferry travel agency and simulated scenarios.

In the following sections, we first review related literature and then describe the optimization model for suggesting the best promotion strategies and discuss how it can be adapted for different business rules. We also explain how a learning algorithm is used to update the estimations in the optimization model as new information becomes available. We then outline the application of this approach to a service setting, before discussing the theoretical and managerial implications, along with directions for future research.

Literature Review

CLV and Resource Allocation

We focus on research using CLV as the key outcome metric in resource allocation. Table 1 compares key dimensions of our proposed methodology against related studies in the literature on CLV and promotional resource allocation. Methodological approaches in previous studies differ significantly from our approach, with some using a static optimization approach (Berger and Bechwati 2001; Venkatesan and Kumar 2004; von Mutius and Huchzermeier 2021) while others have provided predictive insights without formally optimizing marketing decisions (Reinartz and Kumar 2003; Rust, Kumar, and Venkatesan 2011). The research most closely related to our work is that of von Mutius and Huchzermeier (2021) and Rust, Kumar, and Venkatesan (2011). Like von Mutius and Huchzermeier (2021), we optimize promotional strategies to maximize CLV, but our methodologies and approaches differ. Our model uses a dynamic learning algorithm to adapt and refine promotional strategies in real time while they employ a more static analytical model to assess the effectiveness of category-specific coupons. Rust, Kumar, and Venkatesan (2011) use Monte Carlo simulation to dynamically predict the probability of the CLV and then decide on marketing efforts based on this prediction. Instead of the CLV driving the promotional campaign, our approach dynamically changes this scenario by measuring the impact of past promotional campaigns on CLV. Monte Carlo simulation models and RL are both used in machine learning, with the former being used more for prediction modeling (e.g., dynamically predicting CLV), whereas the latter is a type of machine learning that allows the decision-maker (e.g., manager) to learn through the consequences of actions (e.g., the effect of a promotion on the number of transactions). Their simulation helps in identifying the most valuable customers rather than optimizing marketing campaigns to maximize CLV.

Table 1.

Resource Allocation Strategies Using CLV.

Authors	Application	Methodology	Insights	Customer Segmentation	Dynamic Approach	Personalization	Multi-Channel Optimization	Empirical Data	Predictive vs. Prescriptive	Value Co-Creation
Reinartz and Kumar (2000)	Catalog retailer	NBD/pareto	Profitability prioritized over tenure	✓	✗	✗	✗	✓	Predictive	✗
Berger and Bechwati (2001)	Conceptual model	Decision calculus	Budget split between acquisition and retention	✗	✗	✗	✓	✗	Prescriptive	✗
Venkatesan and Kumar (2004)	High-tech service	Gamma distribution, genetic algorithms	CLV-based selection enhances profits	✓	✗	✓	✓	✓	Predictive and prescriptive	✗
Reinartz, Thomas, and Kumar (2005)	B2B high-tech	Probit two-stage least squares	Balancing acquisition and retention resources	✓	✗	✗	✗	✓	Prescriptive	✗
Kumar et al. (2008)	IBM case study	Regression, field experiments	CLV-based resource allocation significantly profitable	✓	✗	✓	✗	✓	Prescriptive	✗
Rust, Kumar, and Venkatesan (2011)	High-tech services	Monte Carlo simulation	Improved predictive accuracy for marketing effectiveness	✓	✗	✓	✗	✗	Predictive	✗
Von Mutius and Huchzermeier (2021)	Hypermarket chain	Analytical model	Static analysis of coupon-targeting strategies	✓	✗	✓	✗	✗	Prescriptive	✗
Current study	B2C Ferry travel agency	Reinforcement learning (multi-arm bandit), optimization (Integer linear programming)	Real-time personalized promotional optimization to enhance CLV dynamically	✓	✓	✓	✗	✗ (simulation)	Predictive and prescriptive	✓

As demonstrated in Table 1, our approach extends previous research across methodological and theoretical dimensions, emphasizing in particular dynamic real-time personalization and value co-creation, which remain underexplored in existing literature. Since our framework relies on customer interactions to continuously refine promotional strategies, it is important to explore value co-creation literature in order to better understand the role of customers in shaping the effectiveness of marketing.

Integrating Value Co-Creation in Optimizing Promotional Strategies

Value co-creation originated from service-dominant logic, which states that value is dynamically co-created with customers (Vargo and Lusch 2008), where interactions between the latter and suppliers shape the meaning and relevance of value (Chandler and Vargo 2011). The importance of value co-creation is illustrated in different marketing contexts, including service system design (Moeller et al. 2013), customer loyalty (Cossío-Silva et al. 2016), personalization technologies (Peña-García, Losada-Otálora, and Juliao-Rossi 2022), and multi-channel segmentation (Hosseini, Shajari, and Akbarabadi 2022). Kumar et al. (2010) emphasize the importance of using CKV (feedback, referrals, and participation in innovation) for improving service and product, and thus co-creating value.

Tuunanen et al. (2024) outline five micro-level mechanisms (i.e., social use, customer orientation and decision-making, service experience, service use context, and customer values and goals) to support value co-creation in the design of digital services. Among these mechanisms, customer orientation and decision-making are most relevant to our model, as integrating micro-level mechanisms such as customer orientation (real-time feedback from customers) to guide the decision-making in optimizing promotions facilitates alignment with customers’ current needs.

The two core contributions of this research are, first, the expansion of the CKV concept to include implicit, real-time behavioral feedback, and second, the development and demonstration of a dynamic customer-driven marketing framework that maximizes CLV. This is achieved through the integration of customer segmentation, RL, and optimization techniques. The approach represents an improvement over traditional model-based approaches as it provides a more responsive and collaborative way to enhance the effectiveness of marketing by embedding value co-creation directly into the decision-making process. Specifically, the initial impetus underpinning the approach we present here is a conversation with customers captured through survey data aimed at identifying preferred sources of value, thus making the customer an active contributor to the value creation process. Furthermore, we present a model that provides learning opportunities to both the customers and the firm in the context of the promotions offered by the latter. In other words, the model provides customers with opportunities to experience service offerings that are not their expressed preference (i.e., not indicated by the survey and past behavior), and thus learn new preferences, while in response, the firm can identify promotional offerings that maximize CLV (see Figure 1).

Framework of the Proposed Methodology

The conceptual framework of this research, which has three core steps, is presented in Figure 2. The first step is creating customer groups and calculating their CLV. Customers are segmented based on their historical purchasing data, including factors such as frequency, recency, and monetary value, and the CLV is then calculated for each group to estimate its future value. The expected future purchases of each customer group are used to tailor promotional campaigns more effectively. This step is described later in the article, where the framework is applied to an online ferry travel agency.

Figure 2.

A conceptual framework for resource allocation model incorporating a learning algorithm.

The second step is developing an optimization model taking into consideration various factors, such as expected future purchases, the costs of promotions, available resources, and potential returns. The purpose of this model is to allocate marketing resources efficiently and to maximize the overall CLV, that is, finding the right promotion for the right customer. The promotions recommended by the optimization model are then offered to each customer group.

In the third step, customer responses (data on how customers reacted to the promotions and changes in their purchasing behavior) are collected at the end of the promotion period. At this stage, the learning algorithm updates the parameters of the optimization model using the new information on customer responses, incorporating a collaborative process where customer feedback directly influences the optimization of marketing strategies, creating value for both the company and the customers.

Optimization Model of Promotional Strategies

We use the following notation to present a generalizable model. The objective is to maximize the collective CLV through promotional campaigns; that is, we assess which promotional campaign should be offered to each customer group, if any, and for how long.

The optimization model considers the costs of the promotions, the available resources, and the business rules it must follow, and it can alter its optimal decision according to the business rules. In our approach, we have explored three different business rules for offering promotions. With the first rule, promotions are offered exclusively to the most profitable customer groups; this is ideal when communication among customers is minimal, such as in luxury retail contexts, where high-end brands, such as Gucci and Louis Vuitton, often target their most loyal customers with exclusive promotions and discounts. Another example is that of airlines that offer frequent-flyer incentives such as priority boarding, free upgrades, and lounge access.

With the second rule, all customer groups receive the same promotional campaign, but the duration of the promotion can vary based on customer segmentation, as different customer segments may need distinct approaches to increase their engagement, such as longer promotions for less engaged customers. This business rule may be suitable when some communication is expected among customers through forums. For example, Amazon Prime Day is a global shopping event for Prime members, offering the same deals and discounts to all such customers; however, the duration of access (e.g., early access and extended offers) to these deals can vary based on the customers’ engagement level and average spending.

Finally, the third rule is that if a promotional campaign is offered, it needs to be the same for all customer groups. This is ideal when a high level of communication among customers is expected, as it ensures equal treatment in order to maintain satisfaction. For example, a popular retail chain’s customers regularly discuss promotions and sales on social media and forums, so to ensure that all its customers feel equally appreciated, the retailer may offer a site-wide sale with the same discount percentage. While we explore these specific rules to demonstrate the potential of the approach, the model is flexible, and other business rules can be incorporated by adapting the business rule constraint.

Parameters

$N$: the number of customer categories, which are grouped based on the frequency and recency of purchases. Each category is indexed by $i$, ranging from 1 to $N$.

$S$: the number of segments based on contribution levels. If the contribution levels are not aggregated, each individual customer can be considered a separate segment. Segments are indexed by $s$, ranging from 1 to $S$.

$N S$: the total number of differentiated customer groups considering both category and contribution level, $N S = N \times S$.

$K$: the number of promotional campaigns (strategies). Each campaign is indexed by $j$, ranging from 1 to $K$.

$k_{1}$: the number of campaigns whose cost is proportional to the price. These campaigns are indexed $1, \dots, k_{1}$. Campaigns indexed from $k_{1} + 1$ to $K$ have a fixed cost, such as shopping vouchers.

$M$: the maximum length of time for which a promotional campaign can be applied. The different lengths (durations) are indexed by $m$, ranging from 1 to $M$. We use months as the time unit in the following description; however, the model is equally applicable to other time units such as weeks.

$t_{i s j m}$: the expected increase in the number of transactions by a category $i$ customer with contribution level $s$ if campaign $j$ is applied for a period of $m$, for $m = 1, \dots, M$.

$e_{i s m}$: the expected number of future purchases in $m$ months by a customer in category $i$ with contribution level $s$ if no promotional campaign is applied.

$c_{j}$: the proportional cost of applying campaign $j$ to a customer purchase. For example, if a company runs a promotional campaign offering a 10% discount on all purchases made by customers as part of the campaign, the proportional cost is 0.10.

$f_{j}$: the fixed cost of applying strategy $j$ to a customer purchase, for example, a £5 voucher.

$a_{i s}$: the average contribution value (e.g., spending) of transactions for future purchases by a category $i$ customer with contribution level $s$.

$η_{i s j}$: a factor reflecting the change in contribution by customers in category $i$ with contribution level $s$ when promotion $j$ is applied. For example, $η_{i s j} = 1.05$ means an increase in contribution of 5%.

$d$: the discount rate, assumed to be constant over the planning period.

$r$: the profit margin.

$l_{i s j}$: the person-hours needed to apply strategy $j$ to all customers of category $i$ with contribution level $s$ for a month. This reflects the staff time needed for activities involved in applying the promotion strategy, such as answering customer queries.

$L$: the total number of person-hours available for applying marketing strategies.

$T$: the end of the planning period.

Key Assumptions

For ease of modeling, we make several simplifying assumptions. First, we use the average past expenditure and transaction frequency of customers within each group as the inputs for our calculations. Second, we consider an exclusive campaign application in which each customer group can receive no more than one promotional campaign at a time, and the application analyzes an existing pool of customers. We also assume static customer grouping, where customer groups are predefined and do not change during the planning period. This is because dynamically updating customer groups while the model is still learning can make it difficult to tell whether changes in behavior are caused by the promotions or by the shifting of customer segments. For example, if a low-frequency customer receives a promotion and increases their purchase rate, dynamic segmentation might reclassify them as medium-frequency mid-training. We further assume that customers are not strategic; that is, they do not change their current behavior in anticipation of promotions. Finally, the expected increase in transactions due to promotions is initially estimated and subsequently updated based on real-time data using the dynamic learning algorithm. Profit margins and the discount rate are assumed to be constant over the planning horizon.

Decision Variables

$x_{i s j m} :$ a binary variable indicating whether to apply campaign j to category i customers with contribution level s for a period of m months (1 if applied, 0 otherwise).

Objective Function

The objective of the model is to maximize the collective CLV through promotional campaigns.

Business Rule 1: Only Profitable Customers are Offered Promotions

Under Business Rule 1, the optimization model can be formulated as follows:

$\begin{aligned} \text{Maximize} \quad & \sum_{i = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{K} \sum_{m = 1}^{M} t_{i s j m} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} r x_{i s j m} \\ & - \sum_{i = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{K} \sum_{m = 1}^{M} c_{j} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} (e_{i s m} + t_{i s j m}) x_{i s j m} \\ & - \sum_{i = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{K} \sum_{m = 1}^{M} f_{j} (e_{i s m} + t_{i s j m}) x_{i s j m} \end{aligned}$ (1)

Subject to:

$\sum_{i = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{K} \sum_{m = 1}^{M} m l_{i s j} x_{i s j m} \leq L$ (2)

$\sum_{j = 1}^{K} \sum_{m = 1}^{M} x_{i s j m} \leq 1, \forall i = 1 \dots N, \forall s = 1 \dots S,$ (3)

$\begin{array}{l} x_{i s j m} \in \{0, 1\}, \forall i = 1, \dots, N, \forall s = 1, \dots, S, \\ \forall j = 1, \dots, K, \forall m = 1, \dots, M \end{array}$ (4)

The first term in the objective function (1) calculates the present value of the additional revenue generated from the increased transactions resulting from promotional campaigns. The additional transactions ($t_{i s j m}$) are multiplied by the average transaction value ($η_{i s j} a_{i s}$) and the profit margin ($r$), then discounted to present value using $d$. The term is summed over all customer groups, campaigns, and promotion durations, reflecting the overall increase in revenue due to the promotional efforts. The second term represents the total cost of proportional price promotions, which are discounts or incentives proportional to the purchase value. The cost $c_{j}$ is applied to the total expected number of transactions (both baseline $e_{i s m}$ and additional $t_{i s j m}$) and then discounted. The third term calculates the total cost incurred by fixed cost promotions, such as vouchers that have a constant cost per customer regardless of the transaction value. Each of the cost terms is also summed over all applicable customer groups, campaigns, and durations. Constraint (2) ensures that the total number of person-hours required for applying all chosen promotional campaigns within the planning period does not exceed the available person-hours ($L$). Constraint (3) ensures that each customer group can receive no more than one promotional campaign for a duration of some months. Constraint (4) forces the decision variables to be binary, ensuring a clear yes/no decision on whether to apply a particular campaign.

Some properties of the problem:

Define

$p_{i s j m} = \begin{cases} t_{i s j m} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} r - c_{j} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} e_{i s m} - c_{j} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} t_{i s j m}, & \forall j = 1, \dots, k_{1} \\ t_{i s j m} \frac{η_{i s j} a_{i s}}{{(1 + d)}^{m}} r - f_{j} e_{i s m} - f_{j} t_{i s j m}, & \forall j = k_{1} + 1, \dots, K \end{cases}$

The objective function (1) can be reorganized into the format below.

$\sum_{m = 1}^{M} \sum_{i = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{K} p_{i s j m} x_{i s j m}$ (5)
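As a minimal sketch of the coefficient $p_{i s j m}$ defined above, the Python function below computes one coefficient for a single customer group, campaign, and duration. The function name `coefficient`, its argument names, and the numbers in the example are illustrative assumptions and are not taken from the study.

```python
def coefficient(t, e, a, eta, r, d, m, cost, proportional):
    """Objective coefficient p_{isjm} for one group, campaign, and duration.

    t: expected extra transactions, e: expected baseline transactions,
    a: average contribution, eta: contribution change factor, r: profit margin,
    d: discount rate, m: duration, cost: c_j (proportional) or f_j (fixed).
    """
    discounted_value = eta * a / (1 + d) ** m   # discounted value of one transaction
    gain = t * discounted_value * r             # margin earned on the extra transactions
    if proportional:                            # campaigns 1..k_1: cost is a share of the price
        spend = cost * discounted_value * (e + t)
    else:                                       # campaigns k_1+1..K: fixed cost per purchase
        spend = cost * (e + t)
    return gain - spend


# Hypothetical numbers: 4 baseline and 3 extra purchases of 100 each, 30% margin.
p_discount = coefficient(t=3, e=4, a=100.0, eta=1.0, r=0.3, d=0.0, m=1,
                         cost=0.10, proportional=True)
p_voucher = coefficient(t=3, e=4, a=100.0, eta=1.0, r=0.3, d=0.0, m=1,
                        cost=5.0, proportional=False)
```

In this hypothetical case the gain is 90 for both campaigns, the 10% discount costs 70, and the voucher costs 35, so the coefficients are roughly 20 and 55. Both costs are incurred on the baseline purchases as well as on the extra ones, which is why a promotion can have a negative coefficient even when it raises the number of transactions.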

Property 1: For any customer category $i$ with any contribution level s and any campaign $j$ with any promotion length m, if $p_{i s j m} \leq 0$ , then $x_{i s j m} = 0$ in the optimal solution.

Proof: For any solution with $p_{i s j m} \leq 0$ and $x_{i s j m} = 1$ , changing $x_{i s j m}$ to 0 does not reduce the objective value, and improves it when $p_{i s j m} < 0$ . At the same time, the constraints remain satisfied, because $x_{i s j m}$ appears only on the left-hand side of the "less than or equal to" constraints (2) and (3), so setting it to 0 cannot increase either left-hand side.

Property 2: For a problem without resource restriction, that is, there is no constraint (2), for any duration m, customer category $i$ with contribution level s, if there is at least one strategy $j$ satisfying $p_{i s j m} > 0$ , then $x_{i s j^{*} m} = 1$ in the optimal solution, where $j^{*} = \underset{1 \leq j \leq K}{\arg \max} {p_{i s j m}}$ .

Proof: Without (2), the problem can be decomposed into independent subproblems, one for each time duration $m$, customer category $i$ with contribution level $s$. For each subproblem, constraint (3) is that no more than one of the $x_{i s j m}$ variables can be 1. If there is at least one $j$ satisfying $p_{i s j m} > 0$ , then letting the $x_{i s j m}$ with the largest objective coefficient be 1 will clearly maximize the objective value.
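Properties 1 and 2 suggest a simple selection rule when there is no person-hours constraint. The sketch below is one possible reading, not the authors' procedure: because constraint (3) as written sums over both $j$ and $m$, it keeps at most one (campaign, duration) pair per customer group. The function name `solve_unconstrained`, the dictionary layout, and the numbers are assumptions made for illustration.

```python
def solve_unconstrained(p):
    """Pick, for each group, the (campaign, duration) pair with the largest positive coefficient.

    p maps (group, campaign, duration) -> coefficient p_{isjm}.
    Returns {group: (campaign, duration)}; groups with no positive coefficient are omitted.
    """
    best = {}
    for (group, campaign, duration), value in p.items():
        if value <= 0:                 # Property 1: never select a non-positive coefficient
            continue
        if group not in best or value > p[(group,) + best[group]]:
            best[group] = (campaign, duration)   # Property 2: keep the largest coefficient
    return best


# Hypothetical coefficients for two groups, two campaigns, and up to two months.
p = {
    ("A", 1, 1): 5.0, ("A", 2, 1): 8.0, ("A", 1, 2): -1.0,
    ("B", 1, 1): -2.0, ("B", 2, 1): 0.0,
}
plan = solve_unconstrained(p)
```

Here group A receives campaign 2 for one month, and group B receives nothing because none of its coefficients is positive.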

Based on the above properties, a method is developed and used to solve the problem. This method is described in Appendix A (descriptions and Procedures 1 and 2).
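Appendix A is not reproduced in this section, so the following is not the authors' method. It is an exhaustive-search sketch of the Business Rule 1 model with the person-hours constraint (2), practical only for very small instances. The names `solve_rule1`, `hours`, and `budget`, and all numbers, are assumptions for illustration.

```python
from itertools import product


def solve_rule1(p, hours, budget):
    """Exhaustive search for Business Rule 1 on a tiny instance.

    p[(group, campaign, duration)]: objective coefficient p_{isjm}.
    hours[(group, campaign)]: person-hours per month, l_{isj}.
    budget: total person-hours available, L.
    """
    groups = sorted({group for group, _, _ in p})
    # Per group: no promotion (None) or any pair with a positive coefficient (Property 1).
    options = [
        [None] + [(j, m) for (g, j, m), value in p.items() if g == group and value > 0]
        for group in groups
    ]
    best_value, best_plan = 0.0, {}
    for choice in product(*options):             # one option per group, as in constraint (3)
        plan = {g: c for g, c in zip(groups, choice) if c is not None}
        used = sum(m * hours[(g, j)] for g, (j, m) in plan.items())   # constraint (2)
        value = sum(p[(g, j, m)] for g, (j, m) in plan.items())
        if used <= budget and value > best_value:
            best_value, best_plan = value, plan
    return best_value, best_plan


# Hypothetical instance: two groups and four person-hours available.
p = {("A", 1, 1): 10, ("A", 1, 2): 15, ("B", 1, 1): 8, ("B", 2, 1): 12}
hours = {("A", 1): 2, ("B", 1): 1, ("B", 2): 3}
value, plan = solve_rule1(p, hours, budget=4)
```

Without the person-hours limit, the best plan would combine the coefficients 15 and 12. With only four person-hours, the search instead selects campaign 1 for one month for both groups, for a total of 18. The search space grows exponentially with the number of groups, which is one reason a dedicated procedure is needed for realistic problem sizes.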

Business Rule 2: Same Promotions for Customer Groups, with Varying Lengths of Time

Under this business rule, if any customer group is offered a promotion, all customer groups must be offered the same one. However, the promotions offered can be for different lengths of time.

Here, we have the same model as when using Business Rule 1 with an additional constraint (6).

$N S \sum_{m = 1}^{M} x_{i s j m} \geq \sum_{i^{'} = 1}^{N} \sum_{s^{'} = 1}^{S} \sum_{m = 1}^{M} x_{i^{'} s^{'} j m} \forall i, s, j$ (6)

The right-hand side (RHS) of constraint (6) sums up all the $x$ variables for a promotion $j$. So, if the promotion is applied to any customer category $i$ with any contribution level $s$, the RHS will be positive but at most $N S$, considering constraint (3). The left-hand side must then also be positive, and because constraint (3) limits $\sum_{m = 1}^{M} x_{i s j m}$ to at most 1, the constraint forces $\sum_{m = 1}^{M} x_{i s j m}$ to be 1 for all $i$ and $s$ , that is, promotion $j$ must be applied to all customer groups. The promotions given can be for different lengths of time.

Under Business Rule 2, the solution will be either not to apply a promotion to any customer or to apply the same promotion j to each customer group, for some time duration m, that is, $\sum_{m = 1}^{M} x_{i s j m} = 1$ . The problem can be solved using the same method employed for Business Rule 1, but Procedure 1 in the method needs to be replaced by a modified version. This modified procedure will be called “Procedure 3,” which is also presented in Appendix A. If the solution has a positive objective value, then the solution is optimal; otherwise, the optimal solution is not to apply a promotion to any customer.
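The structure of the Business Rule 2 solution can be illustrated with a short sketch. It is not Procedure 3 from Appendix A, and it ignores the person-hours constraint (2) for simplicity. Under those assumptions, each candidate campaign is evaluated by letting every group take its best duration, even when that group's coefficient is negative. The function name `solve_rule2` and the numbers are hypothetical.

```python
def solve_rule2(p, groups, campaigns, durations):
    """Same campaign for every group, durations may differ (no person-hours limit).

    p[(group, campaign, duration)]: objective coefficient p_{isjm}.
    Returns (objective value, {group: (campaign, duration)}); an empty plan means no promotion.
    """
    best_value, best_plan = 0.0, {}              # baseline: apply no promotion at all
    for j in campaigns:
        # Every group must receive campaign j, so each one takes its best available duration.
        chosen = {g: max(durations, key=lambda m: p[(g, j, m)]) for g in groups}
        value = sum(p[(g, j, chosen[g])] for g in groups)
        if value > best_value:
            best_value = value
            best_plan = {g: (j, m) for g, m in chosen.items()}
    return best_value, best_plan


# Hypothetical coefficients: campaign 1 loses money on group B at every duration.
p = {
    ("A", 1, 1): 10, ("A", 1, 2): 6, ("B", 1, 1): -3, ("B", 1, 2): -1,
    ("A", 2, 1): 4, ("A", 2, 2): 5, ("B", 2, 1): 2, ("B", 2, 2): 3,
}
value, plan = solve_rule2(p, groups=["A", "B"], campaigns=[1, 2], durations=[1, 2])
```

In this example campaign 1 is still selected, with a total of 9 against 8 for campaign 2, because the gain from group A outweighs the small loss on group B. This is the main difference from Business Rule 1, under which group B would simply receive no promotion.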

Business Rule 3: Identical Promotions Offered to All Customer Groups

Under this business rule, if any customer group is offered a promotion, all customers are given it for the same length of time.

We have the same model as for Business Rule 1 with an additional constraint (7).

$\sum_{m = 1}^{M} m x_{i s j m} = \sum_{m = 1}^{M} m x_{i^{'} s^{'} j m} \quad \forall j, i, s, i^{'}, s^{'} \text{ and } (i, s) \neq (i^{'}, s^{'})$ (7)

Under Business Rule 3, the solution will be either not to apply a promotion to any customer or to give every group the same promotion for the same length of time $m$ . The problem can be solved using the same method used for Business Rule 1, but Procedure 1 in the method needs to be replaced by Procedure 4, which is presented in Appendix A.
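Constraint (7) equates the promotion length $\sum_{m} m x_{i s j m}$ across every pair of groups. A minimal check, assuming the same dictionary representation of $x$ keyed by $(i, s, j, m)$ with durations numbered from 1 (the names are hypothetical), might look like this:

```python
def satisfies_rule3(x, groups, J, M):
    """Check constraint (7): for each promotion j, all groups share one duration."""
    for j in range(J):
        durations = {
            sum(m * x.get((i, s, j, m), 0) for m in range(1, M + 1))
            for (i, s) in groups
        }
        if len(durations) > 1:  # at least two groups differ in length for promotion j
            return False
    return True

groups = [(0, 0), (1, 0)]
identical = {(0, 0, 1, 3): 1, (1, 0, 1, 3): 1}  # promotion 1 for 3 months for both groups
varying = {(0, 0, 1, 1): 1, (1, 0, 1, 3): 1}    # same promotion, 1 month versus 3 months

print(satisfies_rule3(identical, groups, J=2, M=5))  # True
print(satisfies_rule3(varying, groups, J=2, M=5))    # False
```

The `varying` allocation is feasible under Business Rule 2 but not under Business Rule 3, which is the only difference between the two rules.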

Improving Model Parameters: A Dynamic Learning Approach

To improve the model parameters dynamically, an RL approach (a multi-armed bandit algorithm) is used. The model parameter $t_{i s j m}$ (i.e., the expected increase in the number of transactions by customers in category $i$ with contribution level $s$ if campaign $j$ is applied for a period of $m$ months) can initially be estimated using various methods, such as survey data, historical data, or expert judgement. If promotional campaigns are applied to different customer groups, we can use the response to them to update the estimate of $t_{i s j m}$ using an RL approach, namely the multi-armed bandit learning algorithm.¹ The algorithm is presented as Procedure 5 in Appendix A.

The algorithm starts with the decision-maker initializing the values $t_{i s j m}$ for the first promotion period, $P = 1$. The optimization model, under the chosen business rule (Rule 1, 2, or 3), then selects the promotion strategy for each customer group for promotion period $P$. At the end of the period, the decision-maker observes the actual number of transactions by each customer group when promotion $j$ is applied for $m$ months, denoted $n_{i s j m}^{p}$, and subtracts the expected number of transactions for that group without promotions ($e_{i s m}$). The result is the observed increase in transactions for customer category $i$ with contribution level $s$ under promotion $j$ in promotion period $P$, denoted ${t^{'}}_{i s j m}^{p}$. The current estimate $t_{i s j m}$ is then updated using the new information ${t^{'}}_{i s j m}^{p}$, and the process is repeated for each promotion period.
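Written compactly, the observed increase described in this paragraph is the difference between the realized and the baseline number of transactions:

${t^{'}}_{i s j m}^{p} = n_{i s j m}^{p} - e_{i s m}$

As a purely hypothetical illustration, if a group was expected to make 120 transactions without a promotion and 150 were observed under promotion $j$, the observed increase would be $150 - 120 = 30$ transactions.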

Two important elements in a learning algorithm are the exploration rate and the learning rate. Exploration means trying decisions that are currently considered suboptimal in order to gather information that can improve the learning algorithm’s policy over time, whereas exploitation applies the best policy found so far to maximize the expected profit. It is important to balance the two: If there is no exploration, we can get trapped in a policy that is no longer optimal, but the more time we spend on learning the demand, the less time is spent exploiting promotions that optimize profit.

Another element is the learning rate, which determines how quickly the model parameters are updated when new data become available (Sutton and Barto 1998). If the way customers react to promotions changes dynamically, we value new information more and can choose a higher rate for the learning parameter to ensure our decision-making reflects this change. The advantage of using the learning approach is that we can learn the accurate value of $t_{i s j m}$ without explicitly understanding all the external factors that affect it, and since $t_{i s j m}$ is updated in real time, it reflects the changing customer buying behavior.
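The interplay of the exploration rate and the learning rate can be sketched in a few lines. This is a minimal illustration and not the paper's Procedure 5: the epsilon-greedy selection rule, the exponential-smoothing update, the function names, and all numbers are assumptions chosen for the example.

```python
import random

def choose_arm(estimates, epsilon, rng):
    """Epsilon-greedy: with probability epsilon explore a random arm, else exploit the best."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))
    return max(estimates, key=estimates.get)

def update_estimate(t_old, n_observed, e_expected, learning_rate):
    """Move the estimate toward the observed uplift t' = n - e by a fraction learning_rate."""
    t_observed = n_observed - e_expected
    return t_old + learning_rate * (t_observed - t_old)

rng = random.Random(0)
# Hypothetical uplift estimates for one customer group; keys are (promotion, months).
estimates = {("loyalty", 5): 0.40, ("voucher", 1): 0.10}

arm = choose_arm(estimates, epsilon=0.0, rng=rng)  # epsilon = 0 means pure exploitation
estimates[arm] = update_estimate(estimates[arm], n_observed=1.2,
                                 e_expected=1.0, learning_rate=0.5)
print(arm, round(estimates[arm], 2))  # ('loyalty', 5) 0.3
```

With a learning rate of 0.5, the estimate moves halfway from 0.40 toward the observed uplift of 0.20. A rate closer to 1 would weight the newest observation more heavily, which suits fast-changing customer behavior.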

We integrate the learning algorithm with the optimization model so that the initial parameter estimates in the model are updated based on real-time customer responses. The learning method can also be used to update the factor reflecting changes in contribution.

The Proposed Methodology Demonstrated on a Ferry Travel Agency

We demonstrate the effectiveness of the proposed framework on simulated scenarios for an online ferry travel agency that serves multiple ferry tour operators. To acquire customers, the agency creates awareness of the tour services by advertising through search engines such as Google and earns a commission on sales. Marketing activities, particularly those aimed at retention, can influence customer behavior and consequently the CLV and profitability (Gupta et al. 2006). Our focus is on retention, arguably the most imperative component of the CLV framework (Gupta, Lehmann, and Stuart 2004). Given the agency’s small profit margins, targeting the right customer with the right promotion is crucial to avoid profit losses, particularly when competition is fierce (Abrate et al. 2020).

Customer Groups and CLV

The first step in our framework is to create customer groups and calculate the CLV. Table 2 illustrates an example of the data, showing the subsequent purchases of the first-time customers “acquired” in January 2018. We aim to predict the likelihood of a customer remaining active (i.e., still purchasing from the firm) and their expected purchases in future months using a probabilistic model based on their past buying behavior. The ferry travel agency has two types of customers: frequent buyers, and holiday customers who purchase ferry trips only once a year. The purchases by the first customer type often occur at random, and their data show a very small number of trips on a weekly basis (.001%) and the majority on a monthly basis. Hence, it is more appropriate to model the number of transactions in each time period (i.e., monthly) as a Bernoulli process rather than a Poisson one. We apply the Beta-geometric/Beta-binomial (BG/BB) model, which is straightforward to implement and effective in forecasting customer behavior (Fader, Hardie, and Shang 2010). This model is suitable because it handles discrete transaction periods and is computationally less intensive than the Pareto/NBD model first developed by Schmittlein, Morrison, and Colombo (1987). Other CLV models, such as the RFM (recency, frequency, monetary value) model, could also be integrated into the proposed framework.

Table 2.

Monthly Buying Behavior by the January 2018 Cohort of First-Time Buyers.

ID	January	February	March	April	May	June	July
800000	1	0	1	0	1	0	1
800001	1	0	0	0	1	0	0
800002	1	0	0	0	1	0	1
800003	1	1	1	1	1	1	1
800004	1	0	1	0	0	0	0
800005	1	0	1	0	0	1	0
800006	1	0	0	0	0	0	1
800007	1	1	1	0	1	1	0
800008	1	0	0	0	0	0	0
. . .
823620	1	0	1	1	0	0	0
823621	1	0	0	0	0	0	0
823622	1	0	1	1	0	0	0
823623	1	0	1	1	0	1	1

To calculate CLV, we follow a two-step process (e.g., Fader, Hardie, and Lee 2005; Venkatesan and Kumar 2004). First, we forecast the future number of transactions using the BG/BB model. Then we estimate the individual average spend/profit per transaction using the gamma-gamma model (Fader, Hardie, and Lee 2005). CLV is then calculated as the future number of transactions multiplied by the profit per transaction, discounted, and summed over the planning horizon.

Each customer’s information is classified by recency and frequency of purchases. The notation used to represent this information is $(x, m, n),$ where $x$ is the number of transactions observed in the time period (0, T], $m$ (0 < $m$ ≤ T) is the time of the last transaction, and $n$ is the total number of repeat transaction opportunities in the period. Note that $x$ and $m$ are defined here just for this section and are different from the $x$ and $m$ used in the optimization model.

As previously mentioned, there are two types of customers, and their buying behavior follows different distributions. If the two types were pooled, yearly customers would appear to be inactive during the long gaps between their purchases. We focus on the more frequent buyers (the monthly buyers) because there is more opportunity to influence their behavior through promotions aimed at increasing their CLV. For the monthly customers, we consider 1 year of data in investigating the behavior of those who first purchased in January 2018. During this period, the firm did not offer any promotions. Six months (February–July) of data are used to fit the model (to understand the customer buying behavior and generate the probability distribution) and to calculate the CLV for the period August to December.² So here, T = 6 (July 2018). In the period (0, T], customers who first purchased in January had six transaction opportunities (n = 6). Table 3 shows an example of the distribution of these customers with different $(x, m)$ . With the six transaction opportunities, that is, from February to July, if a customer made four purchases and the last purchase was made in the fifth month (June), this would be classified as (4, 5, 6). Given this $(x, m, n)$ information about the customer’s purchasing behavior, the BG/BB model is used to calculate the expected number of purchases in the future. In January 2018, the agency had 386,781 new customers, and over the next 6 months, these customers made a total of 36,714 repeat transactions. Thus, the number of repeat transactions amounts to about 9.5% of the new customer cohort, indicating that the customer retention rate is relatively low (details about the model can be found in Appendix B). The maximum-likelihood estimates produce the model parameter values (α = .112, β = 2.821, γ = .296, δ = 1.542, and a log likelihood of −243,143).
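The $(x, m, n)$ summary can be derived directly from a row of Table 2. The sketch below is illustrative; the function name `recency_frequency` and the list representation of a row are assumptions.

```python
def recency_frequency(row):
    """row: 0/1 purchase flags for January..July; January is the acquisition month."""
    repeats = row[1:]  # February..July are the repeat transaction opportunities
    x = sum(repeats)   # frequency: number of repeat transactions
    # Recency: position (1 = February, ..., 6 = July) of the last repeat purchase, 0 if none.
    m = max((k + 1 for k, bought in enumerate(repeats) if bought), default=0)
    n = len(repeats)   # number of repeat transaction opportunities
    return x, m, n

print(recency_frequency([1, 0, 1, 0, 1, 0, 1]))  # customer 800000 in Table 2: (3, 6, 6)
print(recency_frequency([1, 0, 0, 0, 0, 0, 0]))  # customer 800008 in Table 2: (0, 0, 6)

# Repeat transactions relative to the size of the January 2018 cohort, in percent.
print(round(36_714 / 386_781 * 100, 1))  # 9.5
```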

Table 3.

Example of Frequency/Recency Summary of the Monthly Purchase Behavior of the 2018 January Cohort of First-Time Buyers.

x	m	Number of Customers	x	m	Number of Customers
6	6	200	4	4	158
5	6	400	3	4	872
4	6	537	2	4	2,414
3	6	1,435	1	4	3,000
2	6	2,922	3	3	356
1	6	3,305	2	3	2,283
5	5	250	1	3	3,224
4	5	300	2	2	1,104
3	5	1,176	1	2	3,000
2	5	2,074	1	1	5,292
1	5	2,412	0	0	350,067

Table 4 depicts the relationship between frequency and recency and the expected number of transactions. The expected number of transactions in the forecast period increases, as the last time of purchase is more recent. This shows that the longer the interval between making purchases, the more likely it is that the customer is no longer active. The conditional expectation is also an increasing function of the number of repeat transactions in the 6-month calibration period. A customer who has made a transaction every month from February to July is expected to make 3.10 transactions over the next 5 months, whereas one who has not made a purchase since making their initial transaction is expected to make only .03 repeat transactions over that period. However, this category of customers makes up 90.5% of the entire cohort. Taken together, and critically, the customers of this category, despite the low expected number of transactions for each one of them, are expected to make over 10,502 transactions in total during the next 5 months. This makes them collectively more valuable than all the other frequency/recency categories, as can be seen from Tables 3 and 4.
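The aggregate figures quoted for the zero-repeat category follow from the customer count in Table 3 and the rounded expectation of .03 in Table 4, as this short check shows (the variable names are illustrative):

```python
cohort_size = 386_781        # new customers in January 2018
zero_repeat = 350_067        # customers with (x, m) = (0, 0) in Table 3
expected_per_customer = 0.03 # rounded value from Table 4

share = zero_repeat / cohort_size * 100
total_expected = zero_repeat * expected_per_customer

print(round(share, 1))           # 90.5
print(round(total_expected, 2))  # 10502.01
```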

Table 4.

Expected Number of Repeat Transactions from August to December as a Function of Frequency and Recency.

Number of Repeated Transactions February to July	Month of Last Transaction from January to July
Number of Repeated Transactions February to July	January	February	March	April	May	June	July
0	.03
1		.29	.38	.45	.5	.54	.56
2			.56	.77	.91	1.01	1.07
3				.93	1.27	1.47	1.58
4					1.51	1.91	2.08
5						2.3	2.59
6							3.1
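For readers who wish to reproduce entries of Table 4, the following sketch evaluates the BG/BB conditional expectation in the closed form given by Fader, Hardie, and Shang (2010), as we understand it, with the parameter estimates reported above. The function names are hypothetical, and the code is an illustration under those assumptions rather than the implementation used in the study.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bgbb_likelihood(x, m, n, alpha, beta, gamma, delta):
    """Likelihood of x repeat purchases, the last at opportunity m, out of n opportunities."""
    def term(b_arg, g_arg, d_arg):
        return exp(log_beta(alpha + x, b_arg) - log_beta(alpha, beta)
                   + log_beta(g_arg, d_arg) - log_beta(gamma, delta))
    still_active = term(beta + n - x, gamma, delta + n)
    # Customer became inactive right after opportunity m + i, for i = 0 .. n - m - 1.
    dropped_out = sum(term(beta + m - x + i, gamma + 1, delta + m + i)
                      for i in range(n - m))
    return still_active + dropped_out

def expected_future_purchases(x, m, n, n_star, alpha, beta, gamma, delta):
    """Expected purchases over the next n_star opportunities, given the (x, m, n) history."""
    def gamma_ratio(k):  # Gamma(1 + delta + k) / Gamma(gamma + delta + k)
        return exp(lgamma(1 + delta + k) - lgamma(gamma + delta + k))
    purchase_part = exp(log_beta(alpha + x + 1, beta + n - x) - log_beta(alpha, beta))
    survival_part = (delta / (gamma - 1)
                     * exp(lgamma(gamma + delta) - lgamma(1 + delta))
                     * (gamma_ratio(n) - gamma_ratio(n + n_star)))
    return purchase_part * survival_part / bgbb_likelihood(x, m, n, alpha, beta, gamma, delta)

params = dict(alpha=0.112, beta=2.821, gamma=0.296, delta=1.542)
never_repeated = expected_future_purchases(0, 0, 6, 5, **params)
every_month = expected_future_purchases(6, 6, 6, 5, **params)
print(never_repeated, every_month)  # close to the .03 and 3.10 reported in Table 4
```

The likelihood of the zero-repeat history, `bgbb_likelihood(0, 0, 6, **params)`, is also close to the 90.5% share of such customers in the cohort, which is a useful sanity check on the fit.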

To calculate CLV, we multiply the expected number of transactions calculated by the BG/BB model from Table 4 by the discounted average expected contribution over the planning horizon (in our case, 5 months). Note: We use contribution rather than the spending on a future purchase for each customer because the agency only receives a percentage (a commission) of the actual spending. We use the gamma-gamma model (Fader, Hardie, and Lee 2005) to predict the expected contribution for each customer.
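A minimal sketch of this multiplication is shown below. It assumes monthly discounting and an even spread of transactions over the horizon; neither assumption is confirmed by the text, and the function name and inputs are illustrative.

```python
def clv(expected_transactions, contribution, monthly_discount_rate, horizon_months):
    """Expected transactions times the average discounted contribution over the horizon."""
    discount_factors = [(1 + monthly_discount_rate) ** -k
                        for k in range(1, horizon_months + 1)]
    avg_discounted_contribution = contribution * sum(discount_factors) / horizon_months
    return expected_transactions * avg_discounted_contribution

# Illustrative inputs: 3.10 expected transactions (Table 4), a contribution of 15 per
# transaction, an assumed monthly discount rate of .03, and a 5-month horizon.
value = clv(3.10, 15.0, 0.03, 5)
undiscounted = clv(3.10, 15.0, 0.0, 5)
print(round(value, 2), round(undiscounted, 2))
```

Without discounting, the result is simply 3.10 × 15 = 46.5; discounting lowers it by the average of the monthly discount factors.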

An example of the differences in distribution for different customers is presented in Figure 3 (each row/line is a different customer). We can aggregate the customers’ contributions into three levels (i.e., £15, £22, and £40) as the data clustered largely into these groups. We refer to the customers in these groups as low-, medium-, and high-contribution customers to facilitate comparing the effects of promotional campaigns for different levels of contribution.

Figure 3.

Distribution of transaction contributions of individual customers.

The number of future transactions may be increased by targeting promotional campaigns. To gain an initial understanding of the effects of different promotional campaigns, the approach utilizes information obtained through engaging directly with customers, and enrolling them as active players in the creation of the promotional approach to be used by the firms. For this purpose, an online survey was given to a sample of the ferry travel agency’s customer base.

Initial Estimation of Promotional Campaign Effectiveness

Initially, the parameters in the resource allocation model are estimated using information from survey data. A short online questionnaire was developed to gain insight into customer preferences for promotional campaigns and to better understand the demand for ferry trips. For one week, the survey was available to all customers who visited the company’s website. Customers were asked to fill out a questionnaire using the Google survey shown in Appendix C. Convenience sampling was used, and a response rate of 3.3% was obtained, giving a sample of 3,000 questionnaires to be analyzed. While there is likely to be some response bias, this is deemed to be nonproblematic for two reasons. First, the primary purpose of the current study is to demonstrate the initial effectiveness of our methodology, rather than to provide definitive conclusions about the population. Second, the survey data provide a starting point for the RL approach to begin its learning process, and so self-correction is an inherent aspect of the methodology.

Thus, to assess which promotional campaign would be most effective and to understand the potential to increase demand (e.g., expected increase in the number of trips by applying a particular promotion), we classify customers into 19 categories according to their different frequency/recency profiles and examine the customer categories against their preference for promotion. There are very few customers with a frequency of purchase greater than or equal to 6, hence we combine the categories of 5 and 6+ and create a category 5+. Figure 4 provides a visual representation of promotion preferences across different customer categories. The size of each circle represents the proportion of responses for each promotional strategy within each category, scaled to account for the relative size of the category.

Figure 4.

Promotion preference for each frequency/recency customer profile.

In the frequency/recency column, the first number represents the frequency of purchases, and the second indicates the last time the customer bought something. For example, 1/1–3 means those who have purchased once in the last 6 months, and the last purchase was made between 1 and 3 months ago. The upper, median, and lower sample group sizes were 320, 141, and 40, respectively. The normalized means and standard deviations of preferring different promotional campaigns are presented in Table 5.

Table 5.

Summary Statistics of Customer Preference.

Promotion Type	Normalized Mean (%)	Standard Deviation (%)
Loyalty	65.6	22.7
Cancellation	19.9	17.3
Insurance	6	9.2
Vouchers	8.6	8.6

Examining Figure 4 and Table 5, it is evident that loyalty discounts are the most widely preferred promotional campaign, as indicated by the noticeably larger circles in this column and a high normalized mean of 65.6%. The standard deviation of 22.7% shows noticeable variability in preferences among different customer categories.

To validate these observations, we conducted an analysis of variance (ANOVA) test on the data to assess whether the mean scores for each promotion were statistically the same. The ANOVA test (F-value = 2449.13 and F-critical = 3.79) rejects the null hypothesis at the 1% significance level, hence at least two of the mean scores of the promotions are different. The results of the Tukey post hoc test indicate that when customers are asked “What would make you more likely to book with this company today?” the loyalty discount program option was significantly preferred over all the other options, and the other options are not significantly different from each other in terms of preference. For customers who have not purchased from the company, the loyalty discount program, free cancellation product, and free travel insurance are equally preferred. We can estimate the share of wallet and potential demand using the information on the total number of trips made and all of the trips made using “this company.” This is presented in Figure 5. For example, a customer who has purchased three times in the last 6 months, with the last purchase being 3 to 6 months ago, could have made an additional .8 trips with the company. The survey data are used to provide two further pieces of information. First, they identify lower-share-of-wallet customers, that is, customers who split their spending with other companies. The lifetime basket of goods for these lower-share-of-wallet customers has the potential to increase if the firm can raise its share of wallet. Second, the survey allows the firm to identify the promotions most valued by different customer groups, and these values can be used as initial estimates in the optimization model.

Figure 5.

Total demand for ferry trips.

Results

Parameter Values for the Optimization Model

We provide details of the parameter values used in the optimization model; the parameter values $e_{i s m}$ and $t_{i s j m}$ are estimated using Table 4 and the survey data, respectively. To ease the difficulty involved in responding to the survey and thus maximize the response rate (Nulty 2008), rather than collecting monthly data, respondents provided data on grouped monthly periods (i.e., Questionnaire, Q3). To integrate information from the two sources (the CLV for the customer groups and the survey data), the customer profiles were aligned (as shown in Appendix D, Table D1). We consider three contribution levels, that is, low, medium, and high, where the average contributions ( $a_{i s}$ ) are set to 100, 146.66, and 266.66, and the number of customers in each group is shown in Table D2. The profit margin here is the commission, and this is set at 15% per purchase. Over the duration $m$ of the promotional offer, the promotion is assumed to be equally effective in each month. We illustrate the impact of these naive assumptions on the company’s revenue in a later subsection.
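One reading of these values, offered here as an assumption, is that applying the 15% commission to the three $a_{i s}$ levels recovers the £15, £22, and £40 contribution levels identified earlier. A quick check (variable names are illustrative):

```python
commission_rate = 0.15
a_is = {"low": 100.0, "medium": 146.66, "high": 266.66}

# Commission earned per purchase at each level, rounded to the nearest pound.
commission = {level: round(value * commission_rate) for level, value in a_is.items()}
print(commission)  # {'low': 15, 'medium': 22, 'high': 40}
```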

The optimization model only considers four types of promotions (j = 4), namely loyalty discounts, free cancellation, free insurance, and free shopping vouchers, as the survey results showed that other options were of little interest. The parameter values used to estimate the cost of free cancellation and travel insurance promotions reflect the company’s current experience: 3% of customers tend to cancel and receive a full refund, the average travel insurance cost for a ferry trip is £5, and a minimum of £4 is required as a food voucher. The amount of loyalty discount offered varies (i.e., 5%, 7%, and 9%) in assessing its effect on the chosen promotional campaign strategies for different customer groups. The person-hours requirement, set as a constraint in the resource allocation model (constraint [2]), is low in our case because the process can be automated, and customer service is only contacted to reissue a code when a voucher is void. The discount factor is set to .03.

The resource allocation model is run on the estimated parameters derived from the survey results for 14 customer types grouped by frequency and recency categories ( $N =$ 14) and three different contribution levels ( $S$ = 3) for three business rules.

Figure 6 illustrates the profit differences among the three rules, aiding the company in assessing the variations between them. When profit differences are small, companies might opt to make slightly less profit in favor of improving customer relationships.

Figure 6.

Profit difference between business rules.

The model results indicate that promotional discounts are considered the most effective campaign. Under Business Rule 3, the optimal strategy is to offer a 7% discount to all customer groups for 5 months. Figure 7 draws a comparison with the 7% discount for the low-contribution group, highlighting the difference between expected and promotional profit.

Figure 7.

Expected and promotional (7% for 5 months) profits for the low-contribution level.

When a gray bar is shorter than a black one, the promotional campaign results in a loss for that customer group. This means that the cost of the discounts exceeds the revenue generated from the increase in sales due to the promotion, which is the case for customer categories characterized by high frequency and recency (e.g., 5/up to 1 month, 4/1–3 months, 3/up to 1 month). The customer categories with the largest positive difference between promotional and expected sales profit are those with a lower frequency and when purchases were made 3 to 6 months ago (e.g., 1/6 months ago, 2/3–6 months, 3/3–6 months). Under Business Rule 1, promotions would only be offered to profitable customer groups. The survey results show that these groups exhibit demand for the service but share their wallet with other ferry travel agencies or operators. Business Rule 2 proposes an optimal campaign to offer a 7% discount for 1 month to all nonprofitable customer groups (i.e., those whose expected profit is greater without a promotional campaign) and a 7% discount for 5 months to all remaining customer groups. We consider another business rule where if one customer group receives a promotion, all customer groups must receive it as well. However, the promotion can vary for each group and be offered for different lengths of time. Figure 8 shows the optimal promotions for each customer category and contribution level.
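The pattern described above can be illustrated with a simple break-even calculation. The sketch assumes that a promotion reduces the margin on every transaction in the group, including those that would have occurred anyway, and it ignores discounting. The uplift of .5 transactions and the cost of £7 per transaction (a 7% discount on a £100 purchase borne entirely by the agency) are hypothetical; only the baseline expectations are taken from Table 4.

```python
def promotion_gain(baseline, uplift, margin, promo_cost):
    """Profit with the promotion minus profit without it, per customer."""
    with_promo = (baseline + uplift) * (margin - promo_cost)
    without_promo = baseline * margin
    return with_promo - without_promo

# Frequent, recent buyer (3.10 expected transactions) versus an infrequent one (.29).
frequent = promotion_gain(baseline=3.10, uplift=0.5, margin=15.0, promo_cost=7.0)
infrequent = promotion_gain(baseline=0.29, uplift=0.5, margin=15.0, promo_cost=7.0)
print(round(frequent, 2), round(infrequent, 2))  # -17.7 1.97
```

Under these assumptions the promotion pays off only when `uplift * (margin - promo_cost)` exceeds `baseline * promo_cost`, so a group with a high baseline needs a much larger uplift to break even.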

Figure 8.

Promotional campaigns for each contribution level.

The results emphasize the importance of utilizing marketing metrics like CLV together with knowledge of share of wallet when considering promotional campaigns. Intuitively, managers may believe that customers who purchase more frequently should be the primary targets for loyalty discounts, especially when focusing solely on their purchase amounts; however, our findings suggest that this intuition may not be accurate in this case.

We benchmarked our model against that of von Mutius and Huchzermeier (2021) using our data. We simplified our model, specifically by fixing the duration of the promotion and not including the dynamic learning mechanism, to match their model assumptions. The results demonstrated that the models performed comparably (an example of this is shown in Appendix E), which shows that our model, despite its added adaptability, maintains performance under more straightforward conditions. However, by including detailed timing for promotions and dynamic learning, our model significantly enhances its optimization capabilities. Detailed timing enables tailored promotional durations based on customer behavior, leading to more precise and effective campaign planning as illustrated when comparing Business Rules 2 and 1. The dynamic learning mechanism is crucial, especially since survey results can quickly become outdated. To avoid customer switching and loss in the long term, the parameters in the model must be updated dynamically. Next, we illustrate the benefits of using a learning approach when the initial assumptions are inaccurate.

Benefits of a Learning Approach

Our approach incorporates an RL algorithm that continually updates its parameters based on incoming data. This adaptive mechanism is designed to adjust the parameters over time, allowing them to converge towards their accurate values as more data become available. This aspect of our methodology helps mitigate some of the initial biases and inaccuracies in the survey data. For example, if we assume that promotions are equally effective throughout the entire promotion period, the $t_{i s j m}$ for a 7% discount as presented in Figure 9a are the initial values. However, if this assumption is incorrect and the actual $t_{i s j m}$ are the same as the new values presented in Figure 9b, we would lose out on revenue from exercising a nonoptimal policy.

Figure 9.

Heat map showing the difference in $t_{i s j m}$ values. (a) Initial values. (b) New values.

Figure 10 shows the optimal promotional offers with the initial and new $t_{i s j m}$ values under Business Rule 2 for the low-contribution group. The optimal promotional offers for some customer groups differ under the two sets of values. For example, with the initial values, the optimal duration of the 7% discount on future purchases is 5 months for the group that has made two purchases in the last 6 months, with the last purchase made 3 to 6 months ago (2/3–6). With the new values, however, the 7% promotional offer should be applied to the same group (2/3–6) for only 2 months. The profit under the two assumptions differs by 1.7%, or £11,894.39.

Figure 10.

Comparison, under Business Rule 2, of the optimal duration (in months) of the 7% discount promotion for the different values of $t_{i s j m}$.

This is an example of one naïve assumption. If other unobservable variables are not included in the model, it is likely that we are far from the true optimal policy. This is particularly important for companies that have small profit margins, such as the ferry travel agency. The advantage of using a learning algorithm is that we can learn and update our policy without explicitly observing all variables, which is useful, because even if we start with the wrong assumption, we can learn the correct policy over time. We initially assumed that promotions were equally effective across each of the months in the promotion period (initial $t_{i s j m}$ values); however, the data actually follow the new $t_{i s j m}$ . Through exploitation (applying promotional offers that we know yield high profit) and exploration (trying promotional offers that are nonoptimal), the learning algorithm learns the true optimal policy.³ In other words, the RL approach is designed to offer customers opportunities to experience promotions that align with neither historic behaviors nor past expressed preferences (through the survey), with the goal of providing customers with learning opportunities regarding current preferences for promotions. The optimal policy is updated each time new information becomes available, and over time, it converges to the true maximum profit. Figure 11 illustrates this evolution of the optimal profit under different learning rates.⁴ If we expect customer behavior to change frequently, we will choose a higher learning rate to adapt more quickly to these changes. Conversely, a lower learning rate would be more suitable for stable environments, where customer behavior is consistent over time.
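The effect of the learning rate on the speed of convergence can be made concrete under a simplifying assumption that is ours rather than the paper's: suppose the estimate is updated by exponential smoothing with learning rate $\lambda$, and every observed increase equals the true value $\tau$ (no noise). Writing $t^{(k)}$ for the estimate after $k$ updates, the remaining error shrinks geometrically:

$t^{(k)} - \tau = (1 - \lambda)^{k} \left( t^{(0)} - \tau \right)$

For example, with $\lambda = .5$, only $(.5)^{4} \approx 6\%$ of the initial error remains after four promotion periods, whereas with $\lambda = .1$, about $(.9)^{4} \approx 66\%$ remains. With noisy observations, a high $\lambda$ also passes more of that noise into the estimate, which is why a lower rate suits stable environments.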

Figure 11.

Learning curve over time with various learning parameters under business rule 2.

When the learning algorithm can derive accurate estimates of the parameters, the question could be raised as to why we should bother engaging with customers to make them active participants in the value creation process (in our case, through the application of a survey). The answer is that without survey data, we would not have even a rough idea of how different promotions affect the demand for different customer groups. If we had followed our intuition and applied promotions to the most profitable customers for 5 months, we would have made a loss of £11,948.03. We may then change our strategy to apply promotions to those who last purchased 6 months ago. With this poor initial setting, it can take years to arrive at an optimal strategy, without understanding the root cause of ineffective promotions. Survey data help us initialize the parameters to sensible starting decisions.

Discussion

We present a dynamic learning framework for optimizing promotional campaigns to maximize CLV using customer behavioral responses, an approach that embeds value co-creation. This study makes both theoretical and practical contributions to the literature on CLV and value co-creation.

Theoretical Contribution

We expand the features of CKV (Kumar et al. 2010) to include implicit feedback from digital interactions (i.e., customer real-time behavioral responses to marketing campaigns). This reduces reliance on direct feedback mechanisms (such as survey data) and offers real-time adaptation through analyzing interaction patterns. We develop a dynamic CLV optimization model, which includes real-world constraints such as limited resources and can be adapted to different business rules, making the model more generalizable to different industries. Accordingly, we demonstrate how analytics and data-driven approaches can be used to inform CRM strategy, and specifically, we show how RL and optimization can be employed to incorporate value co-creation to drive customer-centric marketing decisions. Furthermore, we operationalize a dynamic value co-creation method, where customers are seen as an integral part of the process, not just receivers of the service. Our findings suggest that such co-creation processes can lead to more personalized and effective service strategies, ultimately contributing to an increase in CLV.

Managerial Implications

Managers tend to target more profitable customers with promotions in resource-constrained environments. However, our findings show that these customers may already spend all of their share of wallet with the company, and hence offering them discounts can reduce profits without increasing CLV. Rather than basing decisions on historical profitability alone, managers need to assess the incremental value potential. We found that mid-tier customers had great potential to increase profit if targeted appropriately by matching them with the right promotion and duration.

The proposed framework offers a practical and scalable method for adapting promotional strategies to changing customer behavior. In contrast, static models use historical data and can suffer if they are based on incorrect assumptions (e.g., promotions are equally effective across the whole duration of the promotion period). Thus, the RL approach allows firms to continuously improve the relevance and effectiveness of their marketing efforts. For example, in an ongoing process, the firm can empower customers to engage proactively with it in creating value. For instance, periodically the firm can generate customer insights (e.g., through focus groups) to create a picture of customers’ evolving needs and wants and can thereby generate an array of new potential promotional offers. The latter can be included in a survey where customers provide feedback on their preferences regarding these emergent ideas, and the information emerging from the survey can help the firm choose new promotions to enter into the dynamic learning model.

While this methodology can be applied to other sectors such as e-commerce and retail, it is especially effective in digital service settings like online travel agencies where customer engagement is critical. Here, switching between providers is effortless, and customer preferences shift rapidly, making it essential to continuously adapt and retain users. Furthermore, we have developed a practical solution approach (Appendix A) that addresses optimization challenges when the dimensionality of the problem increases, that is, the number of groups and promotions grows. Our model aligns with recognized growing needs in the service context (e.g., chatbots, recommendation engines), offering a blueprint for decision-making that is both data-driven and customer-centered.

Limitations and Future Research

As with all modeling approaches, this study has limitations that open up interesting possibilities for future research. One possibility is to relax the model’s constraint that campaigns are applied exclusively, allowing customers to receive more than one promotional offer at a time, and to measure the impact on CLV.

In the current framework, customer segments are fixed over the learning phase. A benefit of assuming static segments is that we can build an understanding of the effect of promotions across different customer groups over time and so isolate the impact of promotions on changes in customer behavior. Specifically, it allows the process of learning about responses to promotions to stabilize. A drawback is that the approach does not accommodate changes in customer segments, which may make it less effective, particularly under high levels of customer dynamism. Accordingly, future work would benefit from exploring hybrid models that explicitly adjust to evolving customer segments. For example, such models could alternate between periods in which segment structures are held fixed for learning stability and periods in which segment composition is revisited.

The three business rules used in the empirical example illustrate a diverse range of strategies that a firm may choose to follow when refining the model for in-house use. We make no claims as to whether the specific features of these rules are appropriate for any individual business; rather, each business will need to determine which rules best fit its own situation. A benefit of the approach is that the rules can be easily adapted to the strategic preferences of the firm.

Another interesting area to explore is a model that considers strategic customer behavior. Researchers may consider integrating game-theoretic approaches to account for customers who adjust their buying behavior in anticipation of promotions.

Selecting the channel for promotions could also be built into the optimization model, so that the framework could tailor the type of promotion, its duration, and its delivery channel. Finally, a longitudinal study would help assess the long-term impact of RL-driven strategies on customer engagement, loyalty, and business performance, providing further evidence of the effectiveness of these approaches.

Supplemental Material

Supplemental material, sj-docx-1-jsr-10.1177_10946705251365524 for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach by Rupal Mandania, John Cadogan, Jiyin Liu and Nayyar Kazmi in Journal of Service Research

Supplemental material, sj-docx-2-jsr-10.1177_10946705251365524 for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach by Rupal Mandania, John Cadogan, Jiyin Liu and Nayyar Kazmi in Journal of Service Research

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Rupal Mandania

Supplemental Material

Supplemental material for this article is available online.

Author Biographies

Rupal Mandania is a Senior Lecturer in Management Science at Loughborough University. She holds degrees in Mathematics and Statistics and a PhD in Operational Research. Her research applies operations research methods to challenges in data analytics, pricing, marketing, scheduling, and optimization, focusing on the airline, telecommunications, and tourism industries.

John Cadogan is Professor of Marketing at the University of Leicester Business School, Honorary Professor at LUT University, and Visiting Professor at the University of Eastern Finland and East China Normal University. He is Editor-in-Chief of International Marketing Review. His research interests include marketing strategy, international marketing, and sales.

Jiyin Liu is Professor of Operations Management at Loughborough Business School, Loughborough University. He holds a PhD from the University of Nottingham and degrees from Northeastern University, China. His research focuses on planning, scheduling, and solving practical operations problems. He has published in top journals including EJOR, OR, and TRB.

Nayyar Kazmi is a Senior Data Scientist with expertise in optimization, machine learning, AI forecasting, and business analytics. She has worked across sectors including marketing, pricing, telecoms, energy, and agriculture. She holds a PhD in Operational Research and a master’s degree in Software Engineering and Computer Science.
