Abstract
Introduction
Promotional campaigns are an important component of marketing strategies, and by targeting specific customer segments with tailored promotions (e.g., discounts, coupons, loyalty programs), businesses can optimize the effectiveness of their marketing (von Mutius and Huchzermeier 2021). However, measuring the long-term effectiveness of firms’ marketing activities (e.g., promotional campaigns) and developing responsive dynamic marketing strategies can be challenging, especially without complete information on customers’ buying behavior (e.g., on the company’s share of customers’ wallets). Many firms, such as the airline Emirates, have access to a rich body of information on their customers’ purchasing behaviors, including the recency and frequency of purchases, and the user journey on the company website and mobile app (e.g., Line et al. 2020); however, obtaining information on customers’ transactions with competing firms, such as United Airlines, is difficult. This limitation forces most companies to rely on an inward view when managing customer relationships, making it challenging to target promotions and other marketing efforts to achieve the best return on investment (Du, Kamakura, and Mela 2007). For example, when resources are limited, firms may choose to target highly profitable customers with promotions (Reinartz and Kumar 2000); yet, if some of those customers already spend their entire wallet with the firm, promotional campaigns such as discounts may have no effect on increasing their expenditure, and revenue is lost on their customary sales. Thus, it is telling that the literature notes that firms tend to overspend on retaining high-value customers and underspend on lower-value ones (Ovchinnikov, Boulu-Reshef, and Pfeifer 2014).
Encouraging a collaborative mechanism between the company and its customers, formally known as the concept of “value co-creation,” can address some of these challenges (Prahalad and Ramaswamy 2004): It can help the company understand its customers’ needs and preferences and align its services accordingly, and marketers are increasingly recognizing co-creation for its important role in marketing strategy (Ad Age Studio 30 2023).
Enhancing individual customers’ lifetime value (CLV) remains a challenge in marketing, especially when key variables such as share of wallet are not directly observable. Studies engaging in and encouraging CLV-oriented research efforts abound in a diverse range of marketing fields, including those focusing on the management of customer equity (Bell et al. 2002; Drèze and Bonfrer 2009; Karvanen, Rantanen, and Luoma 2014; Kim, Boo, and Qu 2018), the quality of the company–client relationship and its recovery (Stakhovych and Tamaddoni 2020), customer relationship management (Berger et al. 2006; Dew et al. 2024; Libai et al. 2020; Salisbury et al. 2023), big data issues (Line et al. 2020; Ram and Zhang 2022), customer segmentation (Kanchanapoom and Chongwatpol 2023), customer engagement (Bijmolt et al. 2010; Manosuthi, Lee, and Han 2021; Meire et al. 2019), loyalty and loyalty programs (Ascarza et al. 2018; Terblanche 2015; Yoo, Bai, and Singh 2020), marketing strategy (McManus 2013), and pricing strategies (Talón-Ballestero, Nieto-García, and Gonzáles-Serrano 2022), among others.
However, much of the existing research on CLV still treats customers as receivers, rather than co-creators, of services. One of the reasons for this gap may lie in how firms approach the issue of customer knowledge, a key factor underpinning firms’ engagement with customers (Kumar et al. 2010). Specifically, there may be a need to expand the traditional notion of how firms define and apply feedback from customers to add value to the firm, which Kumar et al. (2010, p. 299) call “customer knowledge value (CKV).” The classic approach to CKV focuses on engaging with consumers directly and in their networks (e.g., in brand communities) in order to generate what might be termed
We propose that CKV should also include implicit feedback derived from real-time transactional responses (see Figure 1)—for example, how customers react to promotional campaigns such as redemptions, and the effectiveness of different promotions with different customer groups. Unlike traditional transactional behavior, real-time interactions go beyond the monetary value of sales and focus on the knowledge generated from customer responses when interpreted as a signal of preference or motivation. This expansion of CKV requires a methodology aimed at capturing, interpreting, and acting on dynamic responses, as traditional analytical models tend to be static in nature, do not learn from ongoing behavioral patterns, and fail to update decisions in real time.

Figure 1. Adaptive promotion strategy via CKV and value co-creation framework.
To address this, and as represented in Figure 1, we develop a reinforcement learning (RL) algorithm along with an optimization model that continuously updates promotions based on knowledge of customer behavior and needs—a form of value co-creation that enhances the service experience for the customers while maximizing CLV for the firm. RL is an ideal methodology when functional relationships (e.g., the effect of different promotions on the number of transactions for different customer groups) are unknown, and important variables (e.g., share and size of wallet) are not easily observable (Sutton and Barto 1998).
The RL algorithm uses real-time customer behavioral responses to applied promotions to update the estimates of the effectiveness of various promotions on different customer groups. This aligns with the concept of value emerging through interactions, and the updated customer response parameters are, in turn, used to optimize the marketing strategies to maximize CLV. RL is already used in interactive digital service environments such as chatbots and virtual assistants to tailor responses to customer preferences: for example, such a system may ask a customer to choose between two of its responses and then use that choice to refine the style of its future responses to that customer. These systems work in a way that aligns with the principles of CKV, although they have not been framed as such. In our approach, this link is formalized such that the structured optimization model considers business constraints, such as limited resources, ensuring that promotional strategies are in line with customers’ preferences and are also both feasible and cost-effective. The model also encourages co-creation in the optimization by adding customers’ feedback into the equation. Figure 1 summarizes the problem and the approach to the solution.
Our study extends work on CLV by including customers as active participants in shaping future promotional strategies (i.e., co-creators of value) in settings where customer information is incomplete, such as in noncontractual settings where there is limited knowledge about customer behavior (e.g., their share of wallet). The work responds to calls for dynamic planning of multiple promotional campaigns (e.g., Ascarza et al. 2018), for the integration of machine learning models in marketing studies (e.g., Ngai and Wu 2022), and for work that links variables that are not easily observed, such as share and size of wallet, with future-oriented metrics such as CLV for resource allocation problems (Petersen et al. 2009). Thus, we advance service marketing theory by demonstrating how service strategies, and CKV in particular, can be co-created through feedback loops that are both dynamic and data-driven, and by treating customers as value co-creators. We also advance service practice by presenting a methodology for implementing the co-creation of value, illustrating the application of this approach using data collected from a ferry travel agency and simulated scenarios.
In the following sections, we first review related literature and then describe the optimization model for suggesting the best promotion strategies and discuss how it can be adapted for different business rules. We also explain how a learning algorithm is used to update the estimations in the optimization model as new information becomes available. We then outline the application of this approach to a service setting, before discussing the theoretical and managerial implications, along with directions for future research.
Literature Review
CLV and Resource Allocation
We focus on research using CLV as the key outcome metric in resource allocation. Table 1 compares key dimensions of our proposed methodology against related studies in the literature on CLV and promotional resource allocation. Methodological approaches in previous studies differ significantly from our approach, with some using a static optimization approach (Berger and Bechwati 2001; Venkatesan and Kumar 2004; von Mutius and Huchzermeier 2021) while others have provided predictive insights without formally optimizing marketing decisions (Reinartz and Kumar 2003; Rust, Kumar, and Venkatesan 2011). The research most closely related to our work is that of von Mutius and Huchzermeier (2021) and Rust, Kumar, and Venkatesan (2011). Like von Mutius and Huchzermeier (2021), we optimize promotional strategies to maximize CLV, but our methodologies and approaches differ. Our model uses a dynamic learning algorithm to adapt and refine promotional strategies in real time while they employ a more static analytical model to assess the effectiveness of category-specific coupons. Rust, Kumar, and Venkatesan (2011) use Monte Carlo simulation to dynamically predict the probability of the CLV and then decide on marketing efforts based on this prediction. Instead of the CLV driving the promotional campaign, our approach dynamically changes this scenario by measuring the impact of past promotional campaigns on CLV. Monte Carlo simulation models and RL are both used in machine learning, with the former being used more for prediction modeling (e.g., dynamically predicting CLV), whereas the latter is a type of machine learning that allows the decision-maker (e.g., manager) to learn through the consequences of actions (e.g., the effect of a promotion on the number of transactions). Their simulation helps in identifying the most valuable customers rather than optimizing marketing campaigns to maximize CLV.
Table 1. Resource Allocation Strategies Using CLV.
As demonstrated in Table 1, our approach extends previous research across methodological and theoretical dimensions, emphasizing in particular dynamic real-time personalization and value co-creation, which remain underexplored in existing literature. Since our framework relies on customer interactions to continuously refine promotional strategies, it is important to explore value co-creation literature in order to better understand the role of customers in shaping the effectiveness of marketing.
Integrating Value Co-Creation in Optimizing Promotional Strategies
Value co-creation originated from service-dominant logic, which states that value is dynamically co-created with customers (Vargo and Lusch 2008), where interactions between the latter and suppliers shape the meaning and relevance of value (Chandler and Vargo 2011). The importance of value co-creation is illustrated in different marketing contexts, including service system design (Moeller et al. 2013), customer loyalty (Cossío-Silva et al. 2016), personalization technologies (Peña-García, Losada-Otálora, and Juliao-Rossi 2022), and multi-channel segmentation (Hosseini, Shajari, and Akbarabadi 2022). Kumar et al. (2010) emphasize the importance of using CKV (feedback, referrals, and participation in innovation) for improving service and product, and thus co-creating value.
Tuunanen et al. (2024) outline five micro-level mechanisms (i.e., social use, customer orientation and decision-making, service experience, service use context, and customer values and goals) to support value co-creation in the design of digital services. Among these mechanisms, customer orientation and decision-making are most relevant to our model, as integrating micro-level mechanisms such as customer orientation (real-time feedback from customers) to guide the decision-making in optimizing promotions facilitates alignment with customers’ current needs.
The two core contributions of this research are, first, the expansion of the CKV concept to include implicit, real-time behavioral feedback, and second, the development and demonstration of a dynamic customer-driven marketing framework that maximizes CLV. This is achieved through the integration of customer segmentation, RL, and optimization techniques. The approach represents an improvement over traditional model-based approaches as it provides a more responsive and collaborative way to enhance the effectiveness of marketing by embedding value co-creation directly into the decision-making process. Specifically, the initial impetus underpinning the approach we present here is a conversation with customers captured through survey data aimed at identifying preferred sources of value, thus making the customer an
Framework of the Proposed Methodology
The conceptual framework of this research, which has three core steps, is presented in Figure 2. The first step is creating customer groups and calculating their CLV. Customers are segmented based on their historical purchasing data, including factors such as frequency, recency, and monetary value, and the CLV is then calculated for each group to estimate its future value. The expected future purchases of each customer group are used to tailor promotional campaigns more effectively. This step is described later in the article, where the framework is applied to an online ferry travel agency.

Figure 2. A conceptual framework for resource allocation model incorporating a learning algorithm.
The second step is developing an optimization model taking into consideration various factors, such as expected future purchases, the costs of promotions, available resources, and potential returns. The purpose of this model is to allocate marketing resources efficiently and to maximize the overall CLV, that is, finding the right promotion for the right customer. The promotions recommended by the optimization model are then offered to each customer group.
In the third step, customer responses (data on how customers reacted to the promotions and changes in their purchasing behavior) are collected at the end of the promotion period. At this stage, the learning algorithm updates the parameters of the optimization model using the new information on customer responses, incorporating a collaborative process where customer feedback directly influences the optimization of marketing strategies, creating value for both the company and the customers.
Optimization Model of Promotional Strategies
We use the following notation to present a generalizable model. The objective is to maximize the collective CLV through promotional campaigns; that is, we assess which promotional campaign should be offered to each customer group, if any, and for how long.
The optimization model considers the costs of the promotions, the available resources, and the business rules it must follow, and it can alter its optimal decision according to the business rules. In our approach, we have explored three different business rules for offering promotions. With the first rule, promotions are offered exclusively to the most profitable customer groups; this is ideal when communication among customers is minimal, such as in luxury retail contexts, where high-end brands, such as Gucci and Louis Vuitton, often target their most loyal customers with exclusive promotions and discounts. Another example is that of airlines that offer frequent-flyer incentives such as priority boarding, free upgrades, and lounge access.
With the second rule, all customer groups receive the same promotional campaign, but the duration of the promotion can vary based on customer segmentation, as different customer segments may need distinct approaches to increase their engagement, such as longer promotions for less engaged customers. This business rule may be suitable when some communication is expected among customers through forums. For example, Amazon Prime Day is a global shopping event for Prime members, offering the same deals and discounts to all such customers; however, the duration of access (e.g., early access and extended offers) to these deals can vary based on the customers’ engagement level and average spending.
Finally, the third rule is that if a promotional campaign is offered, it needs to be the same for all customer groups. This is ideal when a high level of communication among customers is expected, as it ensures equal treatment in order to maintain satisfaction. For example, a popular retail chain’s customers regularly discuss promotions and sales on social media and forums, so to ensure that all its customers feel equally appreciated, the retailer may offer a site-wide sale with the same discount percentage. While we explore these specific rules to demonstrate the potential of the approach, the model is flexible, and other business rules can be incorporated by adapting the business rule constraint.
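To make the difference between the three business rules concrete, the following minimal sketch enumerates promotion plans for a toy problem. It is an illustration only: the group names, promotion names, durations, and the `incremental_profit` function are hypothetical stand-ins and are not the objective function or data of the model described in this article. Resource constraints are also omitted.

```python
from itertools import product

# Hypothetical groups, promotions, and durations (months); illustrative only.
GROUPS = ["low_freq", "mid_freq", "high_freq"]
PROMOS = ["discount_7", "loyalty"]
DURATIONS = [1, 3, 5]


def incremental_profit(group, promo, months):
    """Toy profit change versus offering no promotion (assumed, not the paper's formula)."""
    uplift = {"low_freq": 4.0, "mid_freq": 1.5, "high_freq": -2.0}[group]
    scale = {"discount_7": 1.0, "loyalty": 0.6}[promo]
    return uplift * scale * months


def rule1():
    """Rule 1: each group gets its own best plan, or nothing if every plan loses money."""
    total, plan = 0.0, {}
    for g in GROUPS:
        best = max(product(PROMOS, DURATIONS),
                   key=lambda pd: incremental_profit(g, *pd))
        gain = incremental_profit(g, *best)
        plan[g] = best if gain > 0 else None
        total += max(gain, 0.0)
    return total, plan


def rule2():
    """Rule 2: one promotion for all groups (or none); duration may differ by group."""
    best_total, best_plan = 0.0, {g: None for g in GROUPS}
    for p in PROMOS:
        plan = {g: (p, max(DURATIONS, key=lambda d: incremental_profit(g, p, d)))
                for g in GROUPS}
        total = sum(incremental_profit(g, *plan[g]) for g in GROUPS)
        if total > best_total:
            best_total, best_plan = total, plan
    return best_total, best_plan


def rule3():
    """Rule 3: one promotion and one duration for all groups (or none)."""
    best_total, best_plan = 0.0, {g: None for g in GROUPS}
    for p, d in product(PROMOS, DURATIONS):
        total = sum(incremental_profit(g, p, d) for g in GROUPS)
        if total > best_total:
            best_total, best_plan = total, {g: (p, d) for g in GROUPS}
    return best_total, best_plan


for name, rule in [("Rule 1", rule1), ("Rule 2", rule2), ("Rule 3", rule3)]:
    profit, plan = rule()
    print(name, round(profit, 2), plan)
```

Because every plan allowed under Rule 3 is also allowed under Rule 2, and every Rule 2 plan is allowed under Rule 1, the toy profits can only decrease as the rules become stricter. In this sketch, Rule 2 gives the unprofitable group the shortest duration, which limits the loss from treating all groups alike.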
Parameters
Key Assumptions
For ease of modeling, we make several simplifying assumptions. First, we use the average past expenditure and transaction frequency of customers within each group as the inputs for our calculations. Second, we consider an exclusive campaign application in which each customer group can receive no more than one promotional campaign at a time, and the application analyzes an existing pool of customers. Third, we assume static customer grouping, where customer groups are predefined and do not change during the planning period. This is because dynamically updating customer groups while the model is still learning can make it difficult to tell whether changes in behavior are caused by the promotions or by the shifting of customer segments. For example, if a low-frequency customer receives a promotion and increases their purchase rate, dynamic segmentation might reclassify them as medium-frequency mid-training. Fourth, we assume that customers are not strategic; that is, they do not change their current behavior in anticipation of promotions. Fifth, the expected increase in transactions due to promotions is initially estimated and subsequently updated based on real-time data using the dynamic learning algorithm. Finally, profit margins and the discount rate are assumed to be constant over the planning horizon.
Decision Variables
Objective Function
The objective of the model is to maximize the collective CLV through promotional campaigns.
Business Rule 1: Only Profitable Customers are Offered Promotions
Under Business Rule 1, the optimization model can be formulated as follows:
Subject to:
The first term in the objective function (1) calculates the present value of the additional revenue generated from the increased transactions resulting from promotional campaigns. The additional transactions (
The objective function (1) can be reorganized into the format below.
Based on the above properties, a method is developed and used to solve the problem. This method is described in Appendix A (descriptions and Procedures 1 and 2).
Business Rule 2: Same Promotions for Customer Groups, with Varying Lengths of Time
Under this business rule, if any customer group is offered a promotion, all customer groups must be offered the same one. However, the promotions offered can be for different lengths of time.
Here, we have the same model as when using Business Rule 1 with an additional constraint (6).
The right-hand side (RHS) of constraint (6) sums up all the
Under Business Rule 2, the solution will be either not to apply a promotion to any customer or to apply the same promotion
Business Rule 3: Identical Promotions Offered to All Customer Groups
Under this business rule, if any customer group is offered a promotion, all customers are given it for the same length of time.
We have the same model as for Business Rule 1 with an additional constraint (7).
Under Business Rule 3, the solution will be either not to apply a promotion to any customer or to give every group the same promotion for the same length of time
Improving Model Parameters: A Dynamic Learning Approach
To improve the model parameters dynamically, an RL approach (multi-armed bandit algorithm) is used. The model parameter
The algorithm starts with the decision-maker initializing the values
Two important elements in a learning algorithm are the exploration rate and the learning rate. Exploration means trying decisions that currently appear suboptimal in order to gather information that can improve the learning algorithm’s policy over time, whereas exploitation applies the best policy found thus far to maximize the expected profit. It is important to balance the two: If there is no exploration, we can get trapped in a policy that is no longer optimal, but the more time we spend on learning the demand, the less time is spent exploiting promotions that optimize profit.
Another element is the learning rate, which determines how quickly the model parameters are updated when new data become available (Sutton and Barto 1998). If the way customers react to promotions changes dynamically, we value new information more and can choose a higher rate for the learning parameter to ensure our decision-making reflects this change. The advantage of using the learning approach is that we can learn the accurate value of
We incorporate the learning algorithm with the optimization model, which can update the initial estimates of parameters in the model based on real-time customer responses. The learning method can also be used to update the factor reflecting changes in contribution.
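The roles of the exploration rate and the learning rate can be illustrated with a minimal epsilon-greedy multi-armed bandit. This is a generic sketch under stated assumptions, not the algorithm specified in this article: the promotion names, the simulated "true" uplift values, the Gaussian noise, and the parameter names `epsilon` and `alpha` are all hypothetical.

```python
import random


def update_estimate(old, observed, alpha):
    """Constant-step-size update: move the estimate a fraction alpha toward the new observation."""
    return old + alpha * (observed - old)


def run_bandit(true_uplift, periods=2000, epsilon=0.1, alpha=0.05, noise=0.2, seed=0):
    """Learn the uplift of each promotion from simulated responses (all values assumed)."""
    rng = random.Random(seed)
    # Initial estimates; in practice these could come from prior data such as a survey.
    estimates = {promo: 0.0 for promo in true_uplift}
    for _ in range(periods):
        if rng.random() < epsilon:
            promo = rng.choice(list(estimates))        # explore: try a random promotion
        else:
            promo = max(estimates, key=estimates.get)  # exploit: use the current best estimate
        observed = rng.gauss(true_uplift[promo], noise)  # simulated customer response
        estimates[promo] = update_estimate(estimates[promo], observed, alpha)
    return estimates


# Hypothetical average increase in transactions per promotion.
learned = run_bandit({"none": 0.0, "discount_7": 1.0, "loyalty": 0.4})
print(max(learned, key=learned.get), learned)
```

A larger `alpha` weights recent responses more heavily, which suits settings where customer reactions drift over time, while a larger `epsilon` spends more periods testing promotions that currently look inferior.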
The Proposed Methodology Demonstrated on a Ferry Travel Agency
We demonstrate the effectiveness of the proposed framework on simulated scenarios for an online ferry travel agency that serves multiple ferry tour operators. To acquire customers, the agency creates awareness of the tour services by advertising through search engines such as Google and earns a commission on sales. Marketing activities, particularly those aimed at retention, can influence customer behavior and consequently the CLV and profitability (Gupta et al. 2006). Our focus is on retention, arguably the most imperative component of the CLV framework (Gupta, Lehmann, and Stuart 2004). Given the agency’s small profit margins, targeting the right customer with the right promotion is crucial to avoid profit losses, particularly when competition is fierce (Abrate et al. 2020).
Customer Groups and CLV
The first step in our framework is to create customer groups and calculate the CLV. Table 2 illustrates an example of the data, showing the subsequent purchases of the first-time customers “acquired” in January 2018. We aim to predict the likelihood of a customer remaining active (i.e., still purchasing from the firm) and their expected purchases in future months using a probabilistic model based on their past buying behavior. The ferry travel agency has two types of customers: frequent buyers, and holiday customers who purchase ferry trips only once a year. The purchases by the first customer type often occur at random, and their data show a very small number of trips on a weekly basis (.001%) and the majority on a monthly basis. Hence, it is more appropriate to model the number of transactions in each time period (i.e., monthly) as a Bernoulli process rather than a Poisson one. We apply the Beta-geometric/Beta-binomial (BG/BB) model, which is straightforward to implement and effective in forecasting customer behavior (Fader, Hardie, and Shang 2010). This model is suitable because it handles discrete transaction periods and is computationally less intensive than the Pareto/NBD model first developed by Schmittlein, Morrison, and Colombo (1987). Other CLV models, such as the RFM (recency, frequency, monetary value) model, could also be integrated into the proposed framework.
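As a rough illustration of how the BG/BB model uses frequency and recency, the sketch below evaluates the standard BG/BB likelihood of a single purchase history, following the form given by Fader, Hardie, and Shang (2010). The parameter values are arbitrary placeholders rather than estimates from the ferry agency data, and the function names are our own.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """Likelihood of a purchase string with x purchases, last purchase at
    opportunity t_x, observed over n opportunities."""
    denom_p = log_beta(alpha, beta)
    denom_t = log_beta(gamma, delta)
    # Case 1: the customer is still active after all n opportunities.
    total = exp(log_beta(alpha + x, beta + n - x) - denom_p
                + log_beta(gamma, delta + n) - denom_t)
    # Case 2: the customer became inactive just before opportunity t_x + i + 1.
    for i in range(n - t_x):
        total += exp(log_beta(alpha + x, beta + t_x - x + i) - denom_p
                     + log_beta(gamma + 1, delta + t_x + i) - denom_t)
    return total


def n_strings(x, t_x):
    """Number of distinct purchase strings sharing the same (x, t_x) summary."""
    return 1 if x == 0 else comb(t_x - 1, x - 1)


# Placeholder parameters (assumed for illustration, not fitted to any data here).
params = dict(alpha=1.2, beta=0.75, gamma=0.66, delta=2.78)
n = 6  # six monthly purchase opportunities, as in a February to July calibration window

# Sanity check: the probabilities of all possible purchase strings should sum to one.
total_prob = bgbb_likelihood(0, 0, n, **params) + sum(
    n_strings(x, t_x) * bgbb_likelihood(x, t_x, n, **params)
    for x in range(1, n + 1) for t_x in range(x, n + 1)
)
print(round(total_prob, 6))
```

In practice the four parameters would be estimated by maximizing the sum of log-likelihoods over all customers, weighted by the number of customers in each frequency/recency cell.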
Table 2. Monthly Buying Behavior by the January 2018 Cohort of First-Time Buyers.
To calculate CLV, we follow a two-step process (e.g., Fader, Hardie, and Lee 2005; Venkatesan and Kumar 2004). First, we forecast the future number of transactions using the BG/BB model. Then we estimate the individual average spend/profit per transaction using the gamma-gamma model (Fader, Hardie, and Lee 2005). CLV is then calculated as the future number of transactions multiplied by the profit per transaction, discounted, and summed over the planning horizon.
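The final step of this calculation can be sketched as follows. The function is a minimal illustration of "transactions times profit per transaction, discounted and summed"; the monthly transaction forecasts and the discount rate below are made-up inputs, and end-of-month discounting is an assumption.

```python
def discounted_clv(expected_transactions, contribution, monthly_discount_rate):
    """Sum of expected transactions per month times contribution per transaction,
    discounted back to the start of the planning horizon (month 1 discounted once)."""
    return sum(
        n_t * contribution / (1.0 + monthly_discount_rate) ** t
        for t, n_t in enumerate(expected_transactions, start=1)
    )


# Hypothetical forecast: 0.5 expected transactions in each of 5 months,
# a contribution of 22 per transaction, and a 1% monthly discount rate (assumed).
clv = discounted_clv([0.5] * 5, contribution=22.0, monthly_discount_rate=0.01)
print(round(clv, 2))
```

With a discount rate of zero the result reduces to the total expected number of transactions multiplied by the contribution per transaction, which is a convenient check on any implementation.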
Each customer’s information is classified by recency and frequency of purchases. The notation used to represent this information is
As previously mentioned, there are two types of customers, and their buying behavior follows different distributions. If we pool them, the yearly customers will appear to be no longer active with the company when they are merely between infrequent purchases. We focus on the more frequent buyers (the monthly buyers) because there is more opportunity to influence their behavior through promotions aimed at increasing their CLV. For the monthly customers, we consider 1 year of data in investigating the behavior of those who first purchased in January 2018. During this period, the firm did not offer any promotions. Six months (February–July) of data are used to fit the model (to understand the customer buying behavior and generate the probability distribution), and the CLV is calculated for the period August to December.
So here,
Table 3. Example of Frequency/Recency Summary of the Monthly Purchase Behavior of the January 2018 Cohort of First-Time Buyers.
Table 4 depicts the relationship between frequency and recency and the expected number of transactions. The expected number of transactions in the forecast period increases as the last purchase becomes more recent. This shows that the longer the interval since the last purchase, the more likely it is that the customer is no longer active. The conditional expectation is also an increasing function of the number of repeat transactions in the 6-month calibration period. A customer who has made a transaction every month from February to July is expected to make 3.10 transactions over the next 5 months, whereas one who has not made a purchase since their initial transaction is expected to make only .03 repeat transactions over that period. However, this latter category of customers makes up 90.5% of the entire cohort. Critically, despite the low expected number of transactions for each of them, the customers in this category are together expected to make over 10,502 transactions during the next 5 months. This makes them collectively more valuable than all the other frequency/recency categories, as can be seen from Tables 3 and 4.
Table 4. Expected Number of Repeat Transactions from August to December as a Function of Frequency and Recency.
To calculate CLV, we multiply the expected number of transactions calculated by the BG/BB model (Table 4) by the discounted average expected contribution over the planning horizon (in our case, 5 months). We use contribution rather than the spending on a future purchase for each customer because the agency receives only a percentage of the actual spending, that is, a commission. We use the gamma-gamma model (Fader, Hardie, and Lee 2005) to predict the expected contribution for each customer.
An example of the differences in distribution for different customers is presented in Figure 3 (each row/line is a different customer). We can aggregate the customers’ contributions into three levels (i.e., £15, £22, and £40) as the data clustered largely into these groups. We refer to the customers in these groups as low-, medium-, and high-contribution customers to facilitate comparing the effects of promotional campaigns for different levels of contribution.

Figure 3. Distribution of transaction contributions of individual customers.
The number of future transactions may be increased by targeting promotional campaigns. To gain an initial understanding of the effects of different promotional campaigns, the approach utilizes information obtained through engaging directly with customers, and enrolling them as active players in the creation of the promotional approach to be used by the firms. For this purpose, an online survey was given to a sample of the ferry travel agency’s customer base.
Initial Estimation of Promotional Campaign Effectiveness
Initially, the parameters in the resource allocation model are estimated using information from survey data. A short online questionnaire was developed in order to gain insight into customer preferences for promotional campaigns and to better understand the demand for ferry trips. For one week, the survey was available to all customers who visited the company’s website, who were asked to complete the Google survey shown in Appendix C. Convenience sampling was used, and a response rate of 3.3% was obtained, giving a sample of 3,000 questionnaires to be analyzed. While there is likely to be some response bias, this is deemed to be nonproblematic for two reasons. First, the primary purpose of the current study is to demonstrate the initial effectiveness of our methodology, rather than to provide definitive conclusions about the population. Second, the survey data provide only a starting point for the RL approach to begin its learning process, and self-correction is an inherent aspect of the methodology.
Thus, to assess which promotional campaign would be most effective and to understand the potential to increase demand (e.g., expected increase in the number of trips by applying a particular promotion), we classify customers into 19 categories according to their different frequency/recency profiles and examine the customer categories against their preference for promotion. There are very few customers with a frequency of purchase greater than or equal to 6, hence we combine the categories of 5 and 6+ and create a category 5+. Figure 4 provides a visual representation of promotion preferences across different customer categories. The size of each circle represents the proportion of responses for each promotional strategy within each category, scaled to account for the relative size of the category.

Figure 4. Promotion preference for each frequency/recency customer profile.
In the frequency/recency column, the first number represents the frequency of purchases, and the second indicates the last time the customer bought something. For example, 1/1–3 means those who have purchased once in the last 6 months, and the last purchase was made between 1 and 3 months ago. The upper, median, and lower sample group sizes were 320, 141, and 40, respectively. The normalized means and standard deviations of preferring different promotional campaigns are presented in Table 5.
Table 5. Summary Statistics of Customer Preference.
Examining Figure 4 and Table 5, it is evident that loyalty discounts are the most widely preferred promotional campaign, as indicated by the noticeably larger circles in this column and a high normalized mean of 65.6%. The standard deviation of 22.7% shows noticeable variability in preferences among different customer categories.
To validate these observations, we conducted an analysis of variance (ANOVA) test on the data to assess whether or not the mean scores for each promotion were statistically the same. The ANOVA test (

Figure 5. Total demand for ferry trips.
Results
Parameter Values for the Optimization Model
We provide details of the parameter values used in the optimization model; the parameter values
The optimization model only considers four types of promotions (
The resource allocation model is run on the estimated parameters derived from the survey results for 14 customer types grouped by frequency and recency categories (
Figure 6 illustrates the profit differences among the three rules, aiding the company in assessing the variations between them. When profit differences are small, companies might opt to make slightly less profit in favor of improving customer relationships.

Profit difference between business rules.
The model results indicate that promotional discounts are considered the most effective campaign. Under Business Rule 3, the optimal strategy is to offer a 7% discount to all customer groups for 5 months. Figure 7 draws a comparison with the 7% discount for the low-contribution group, highlighting the difference between expected and promotional profit.

Expected and promotional (7% for 5 months) profits for the low-contribution level.
When a gray bar (promotional profit) is shorter than a black one (expected profit), the promotional campaign results in a loss for that customer group. This means that the cost of the discounts exceeds the revenue generated from the increase in sales due to the promotion, which is the case for customer categories characterized by high frequency and recency (e.g., 5/up to 1 month, 4/1–3 months, 3/up to 1 month). The customer categories with the largest positive difference between promotional and expected sales profit are those with a lower frequency whose last purchase was made 3 to 6 months ago (e.g., 1/6 months ago, 2/3–6 months, 3/3–6 months). Under Business Rule 1, promotions would only be offered to profitable customer groups. The survey results show that these groups exhibit demand for the service but share their wallet with other ferry travel agencies or operators. Business Rule 2 proposes an optimal campaign that offers a 7% discount for 1 month to all nonprofitable customer groups (i.e., those whose expected profit is greater without a promotional campaign) and a 7% discount for 5 months to all remaining customer groups. We also consider a business rule under which, if one customer group receives a promotion, all customer groups must receive one as well, although the promotion can vary by group and be offered for different lengths of time. Figure 8 shows the optimal promotions for each customer category and contribution level.
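The comparison between expected and promotional profit can be illustrated with a deliberately simplified per-group calculation. This is not the optimization model used in the study: the profit formula, the price, the unit cost, and the demand uplifts below are all hypothetical, chosen only to show why a discount can lose money on a frequent, recent buyer while paying off for a lapsed one when margins are small.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    base_demand: float  # expected trips without a promotion (hypothetical)
    uplift: float       # fractional demand increase under the promotion (hypothetical)


def expected_profit(group: Group, price: float, unit_cost: float) -> float:
    """Profit when no promotion is offered."""
    return (price - unit_cost) * group.base_demand


def promo_profit(group: Group, price: float, unit_cost: float, discount: float) -> float:
    """Profit when every trip, including the additional ones, is sold at the discount."""
    discounted_margin = price * (1 - discount) - unit_cost
    return discounted_margin * group.base_demand * (1 + group.uplift)


def profitable_targets(groups, price, unit_cost, discount):
    """Groups whose promotional profit exceeds their expected profit,
    i.e. the groups a rule like Business Rule 1 would target."""
    return [
        g.name
        for g in groups
        if promo_profit(g, price, unit_cost, discount) > expected_profit(g, price, unit_cost)
    ]


groups = [
    Group("5/up to 1 month", base_demand=10, uplift=0.1),
    Group("1/6 months ago", base_demand=2, uplift=0.8),
]
# Price 100, unit cost 80, 7% discount: the margin per trip falls from 20 to about 13.
print(profitable_targets(groups, price=100.0, unit_cost=80.0, discount=0.07))
```

With these made-up numbers, the frequent buyer would need an uplift of more than roughly 54% to break even, because the discount is also given on trips that would have been bought anyway.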

Promotional campaigns for each contribution level.
The results emphasize the importance of utilizing marketing metrics like CLV together with knowledge of share of wallet when considering promotional campaigns. Intuitively, managers may believe that customers who purchase more frequently should be the primary targets for loyalty discounts, especially when focusing solely on their purchase amounts; however, our findings suggest that this intuition may not be accurate in this case.
We benchmarked our model against that of von Mutius and Huchzermeier (2021) using our data. We simplified our model, specifically by fixing the duration of the promotion and not including the dynamic learning mechanism, to match their model assumptions. The results demonstrated that the models performed comparably (an example of this is shown in Appendix E), which shows that our model, despite its added adaptability, maintains performance under more straightforward conditions. However, by including detailed timing for promotions and dynamic learning, our model significantly enhances its optimization capabilities. Detailed timing enables tailored promotional durations based on customer behavior, leading to more precise and effective campaign planning as illustrated when comparing Business Rules 2 and 1. The dynamic learning mechanism is crucial, especially since survey results can quickly become outdated. To avoid customer switching and loss in the long term, the parameters in the model must be updated dynamically. Next, we illustrate the benefits of using a learning approach when the initial assumptions are inaccurate.
Benefits of a Learning Approach
Our approach incorporates an RL algorithm that continually updates its parameters based on incoming data. This adaptive mechanism is designed to adjust the parameters over time, allowing them to converge toward their true values as more data become available. This aspect of our methodology helps mitigate some of the initial biases and inaccuracies in the survey data. For example, if we assume that promotions are equally effective throughout the entire promotion period, the

Heat map showing the difference in
Figure 10 shows the optimal promotional offers with the initial and new

Under business rule 2 comparison of the optimal time (in months) for promotions of 7% discount for the different values of
This is an example of one naïve assumption. If other unobservable variables are not included in the model, it is likely that we are far from the true optimal policy. This is particularly important for companies that have small profit margins, such as the ferry travel agency. The advantage of using a learning algorithm is that we can learn and update our policy without explicitly observing all variables, which is useful, because even if we start with the wrong assumption, we can learn the correct policy over time. We initially assumed that promotions were equally effective across each of the months in the promotion period (initial

Learning curve over time with various learning parameters under business rule 2.
When the learning algorithm can derive accurate estimates of the parameters, the question could be raised as to why we should bother engaging with customers to make them active participants in the value creation process (in our case, through the application of a survey). The answer is that without survey data, we would not have even a rough idea of how different promotions affect the demand for different customer groups. If we had followed our intuition and applied promotions to the most profitable customers for 5 months, we would have made a loss of £11,948.03. We may then change our strategy to apply promotions to those who last purchased 6 months ago. With this poor initial setting, it can take years to arrive at an optimal strategy, without understanding the root cause of ineffective promotions. Survey data help us initialize the parameters to sensible starting decisions.
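The value of a sensible starting point can be illustrated with a toy update rule. The sketch below is not the RL algorithm used in the study; it swaps in a simple exponential-smoothing update, assumes noise-free monthly observations, and uses made-up parameter values. It only shows that, for the same learning rate, an estimate initialized close to the true value needs fewer updates to get within a given tolerance than a poorly initialized one.

```python
def months_to_converge(initial, true_value, learning_rate, tolerance=0.01, max_months=600):
    """Count monthly updates until the estimate is within `tolerance` of the true value.

    Each month the estimate moves a fraction `learning_rate` of the way toward
    the observed value. Observations are assumed noise-free for clarity, so the
    error shrinks by a factor of (1 - learning_rate) per month.
    Returns None if the tolerance is not reached within `max_months`.
    """
    estimate = initial
    for month in range(max_months + 1):
        if abs(estimate - true_value) <= tolerance:
            return month
        observed = true_value  # simplifying assumption: no observation noise
        estimate += learning_rate * (observed - estimate)
    return None


# Hypothetical promotion-response parameter with a true value of 0.30.
survey_start = months_to_converge(initial=0.25, true_value=0.30, learning_rate=0.1)
poor_start = months_to_converge(initial=0.90, true_value=0.30, learning_rate=0.1)
print(survey_start, poor_start)
```

In practice, observations are noisy and only the promotions actually offered generate data, so the gap between a good and a poor starting point is likely to be wider than this idealized sketch suggests.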
Discussion
We present a dynamic learning framework for optimizing promotional campaigns to maximize CLV using customer behavioral responses, an approach that embeds value co-creation. This study makes both theoretical and practical contributions to the literature on CLV and value co-creation.
Theoretical Contribution
We expand the features of CKV (Kumar et al. 2010) to include implicit feedback from digital interactions (i.e., customer real-time behavioral responses to marketing campaigns). This reduces reliance on direct feedback mechanisms (such as survey data) and offers real-time adaptation through analyzing interaction patterns. We develop a dynamic CLV optimization model, which includes real-world constraints such as limited resources and can be adapted to different business rules, making the model more generalizable to different industries. Accordingly, we demonstrate how analytics and data-driven approaches can be used to inform CRM strategy, and specifically, we show how RL and optimization can be employed to incorporate value co-creation to drive customer-centric marketing decisions. Furthermore, we operationalize a dynamic value co-creation method, where customers are seen as an integral part of the process, not just receivers of the service. Our findings suggest that such co-creation processes can lead to more personalized and effective service strategies, ultimately contributing to an increase in CLV.
Managerial Implications
Managers tend to target more profitable customers with promotions in resource-constrained environments. However, our findings show that these customers may already spend their entire wallet with the company, and hence offering them discounts can reduce profits without increasing CLV. Rather than basing decisions on historical profitability alone, managers need to assess customers' incremental value potential. We found that mid-tier customers had great potential to increase profit if targeted appropriately, that is, if matched with the right promotion and duration.
The proposed framework offers a practical and scalable method for adapting promotional strategies to changing customer behavior. In contrast, static models use historical data and can suffer if they are based on incorrect assumptions (e.g., that promotions are equally effective across the whole duration of the promotion period). Thus, the RL approach allows firms to continuously improve the relevance and effectiveness of their marketing efforts. For example, in an ongoing process, the firm can empower customers to engage proactively with it in creating value: periodically, the firm can generate customer insights (e.g., through focus groups) to create a picture of customers’ evolving needs and wants and can thereby generate an array of new potential promotional offers. These offers can be included in a survey in which customers provide feedback on their preferences regarding the emergent ideas, and the resulting information can help the firm choose new promotions to enter into the dynamic learning model.
While this methodology can be applied to other sectors such as e-commerce and retail, it is especially effective in digital service settings like online travel agencies where customer engagement is critical. Here, switching between providers is effortless, and customer preferences shift rapidly, making it essential to continuously adapt and retain users. Furthermore, we have developed a practical solution approach (Appendix A) that addresses optimization challenges when the dimensionality of the problem increases, that is, the number of groups and promotions grows. Our model aligns with recognized growing needs in the service context (e.g., chatbots, recommendation engines), offering a blueprint for decision-making that is both data-driven and customer-centered.
Limitations and Future Research
As with all modeling approaches, this study has limitations that open up interesting possibilities for future research. One possibility is to relax the model's constraint that campaigns are applied exclusively, allowing customers to receive more than one promotional offer, and to measure the impact on CLV.
In the current framework, we have fixed customer segments over the learning phase. A benefit of this approach (i.e., assuming that segments are static) is that we can generate an understanding of the effect of promotions across different customer groups over time and so can isolate the impact of promotions on changes in customer behavior. Specifically, it allows the process of learning about responses to promotions to stabilize. Nevertheless, a drawback is that the approach does not accommodate changes in customer segments, which may make it less effective, particularly under high levels of customer dynamism. Accordingly, future work would benefit from exploring hybrid models that explicitly incorporate the ability to adjust to evolving customer segments. For example, such models could alternate between periods of maintaining segment structures for learning stability and periods of revisiting segment composition.
It is also worth noting that the three business rules used in the empirical example demonstrate a diverse range of strategies that the firm may choose to follow when refining the model for use in-house. We make no claims as to whether the specific features of the rules are appropriate for any individual business; rather, businesses will need to determine which rules best fit their own situation. Fortunately, a benefit of the approach is that the rules can be adapted quite easily to the strategic preferences of the firm.
Another interesting area to explore is a model that considers strategic customer behavior. Researchers may consider integrating game-theory approaches to take into account customers who adjust their buying behaviors in anticipation of promotions.
Selecting the channel for promotions could also be built into the optimization model, so that the framework could tailor the type of promotion, duration, and its optimal delivery channel. Finally, a longitudinal study would help in assessing the long-term impact of RL-driven strategies on customer engagement, loyalty, and business performance, which would provide further evidence of the effectiveness of these approaches.
Supplemental Material
sj-docx-1-jsr-10.1177_10946705251365524 – Supplemental material for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach
Supplemental material, sj-docx-1-jsr-10.1177_10946705251365524 for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach by Rupal Mandania, John Cadogan, Jiyin Liu and Nayyar Kazmi in Journal of Service Research
Supplemental Material
sj-docx-2-jsr-10.1177_10946705251365524 – Supplemental material for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach
Supplemental material, sj-docx-2-jsr-10.1177_10946705251365524 for Optimizing Promotional Campaigns to Maximize Customer Lifetime Value: A Dynamic Learning Approach by Rupal Mandania, John Cadogan, Jiyin Liu and Nayyar Kazmi in Journal of Service Research
