Summary of Four Major Mathematical Modeling Models

1. Optimization Models

1.1 Mathematical Programming Models

Linear programming, integer linear programming, nonlinear programming, multi-objective programming, dynamic programming.
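As an illustration of the simplest case, a small linear program can be solved directly with SciPy; the objective and constraints below are illustrative values, not taken from the text, so this is only a minimal sketch.

```python
# Minimal linear-programming sketch (illustrative data):
# maximize 3x + 5y  subject to  x + 2y <= 8,  3x + y <= 9,  x, y >= 0.
from scipy.optimize import linprog

# linprog minimizes, so negate the objective to maximize 3x + 5y.
c = [-3, -5]
A_ub = [[1, 2],
        [3, 1]]
b_ub = [8, 9]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)  # optimal (x, y) = (2, 3) and maximized objective 21
```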

1.2 Differential Equation Models

Logistic (blocked) growth model, SARS transmission model.
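A minimal sketch of the logistic (blocked) growth model dx/dt = r x (1 - x/K), solved numerically with SciPy; the growth rate r, carrying capacity K, and initial value x0 are assumed illustrative values.

```python
# Logistic growth dx/dt = r*x*(1 - x/K), solved with SciPy (illustrative parameters).
import numpy as np
from scipy.integrate import solve_ivp

r, K, x0 = 0.5, 1000.0, 10.0          # growth rate, carrying capacity, initial population

def logistic(t, x):
    return r * x * (1.0 - x / K)

sol = solve_ivp(logistic, (0, 30), [x0], t_eval=np.linspace(0, 30, 7))
print(sol.t)
print(sol.y[0])  # population approaching the carrying capacity K
```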

1.3 Graph Theory and Network Optimization Problems

Shortest path problem, maximum flow problem, minimum cost maximum flow problem, minimum spanning tree problem (MST), traveling salesman problem (TSP), graph coloring problem.
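As a small illustration of the shortest path problem, the sketch below runs Dijkstra's algorithm with NetworkX on an assumed toy weighted graph (the edges and weights are illustrative, not from the text).

```python
# Shortest path via Dijkstra's algorithm with NetworkX (illustrative graph).
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 4), ("A", "C", 2), ("C", "B", 1),
    ("B", "D", 5), ("C", "D", 8),
])

path = nx.dijkstra_path(G, "A", "D")            # node sequence of the shortest route
length = nx.dijkstra_path_length(G, "A", "D")   # its total weight
print(path, length)   # ['A', 'C', 'B', 'D'] 8
```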

1.4 Probability Models

Decision models, stochastic storage models, stochastic population models, the newsvendor (newsboy) problem, Markov chain models.

1.5 Classic Problems in Combinatorial Optimization

1.5.1 Multi-dimensional Knapsack Problem (MKP)

Knapsack problem: given a set of items, each with a volume, and a knapsack of fixed capacity, how should the knapsack be loaded so that as many items as possible fit inside? Multi-dimensional knapsack problem: given a set of items, each with a value and a volume, and a knapsack with capacity limits in one or more dimensions, how should items be selected to maximize the total value carried in the knapsack? Applications of the multi-dimensional knapsack problem include resource allocation, cargo loading, and storage allocation. The problem is NP-hard.
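A minimal dynamic-programming sketch for the classic 0/1 knapsack (the single-constraint special case described above); the item values, volumes, and capacity are illustrative assumptions.

```python
# 0/1 knapsack by dynamic programming: maximize total value under a capacity limit.
def knapsack(values, volumes, capacity):
    # best[c] = best value achievable with capacity c using the items seen so far
    best = [0] * (capacity + 1)
    for i in range(len(values)):
        # iterate capacity downwards so each item is used at most once
        for c in range(capacity, volumes[i] - 1, -1):
            best[c] = max(best[c], best[c - volumes[i]] + values[i])
    return best[capacity]

print(knapsack(values=[6, 10, 12], volumes=[1, 2, 3], capacity=5))  # -> 22
```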

1.5.2 Quadratic Assignment Problem (QAP)

Job assignment problem: given a set of jobs and a set of workers, with the time each worker needs to complete each job, how should jobs be assigned to workers to minimize the total working time? The quadratic assignment problem (often exemplified by machine layout): given machines to be placed in candidate locations, with the material flow between each pair of machines and the distance between each pair of locations, how should the machines be arranged to minimize the total cost? Applications of the quadratic assignment problem include campus building layout, hospital department arrangement, and forming machining cells (processing centers) in group technology.
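The sketch below solves only the simpler (linear) job-assignment problem described above, using SciPy's Hungarian-algorithm routine; the quadratic assignment problem itself is NP-hard and is not solved here. The cost matrix is an illustrative assumption.

```python
# Linear job-assignment problem via the Hungarian algorithm (illustrative cost matrix).
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[4, 1, 3],     # rows: workers, columns: jobs, entries: time units
                 [2, 0, 5],
                 [3, 2, 2]])

rows, cols = linear_sum_assignment(cost)   # worker rows[i] is assigned job cols[i]
print(list(zip(rows, cols)), cost[rows, cols].sum())  # minimum total working time: 5
```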

1.5.3 Traveling Salesman Problem (TSP)

Traveling salesman problem: given a set of cities and the distance between each pair of cities, find a route that visits each city exactly once and returns to the starting point while minimizing the total distance traveled.

1.5.4 Vehicle Routing Problem (VRP)

The vehicle routing problem (also known as vehicle scheduling): given customer locations and cargo demands, under constraints of available vehicles and carrying capacity, each vehicle departs from the starting point, completes delivery tasks to customer points, and returns to the starting point, aiming to minimize the number of vehicles and total distance traveled. TSP is a special case of the VRP.

1.5.5 Job Shop Scheduling Problem (JSP)

The job shop scheduling problem: given a set of jobs and a set of machines, each job consists of a series of operations that must be executed in a strict serial order. Each operation requires a specific machine for a given processing time, and no machine can process more than one job at a time. The goal is to minimize the time interval (makespan) from the start of the first operation to the completion of the last operation.

2. Classification Models

Discriminant analysis starts from observed samples whose class membership is already known, establishes a discriminant function according to some criterion, and then uses that function to classify samples of unknown type. Cluster analysis, on the other hand, does not pre-specify types for a given batch of samples; the types are determined by analyzing the data themselves.

2.1 Discriminant Analysis

2.1.1 Distance Discriminant Method

Basic idea: first calculate the centroid of each class from the data with known classifications; the discriminant criterion is that any new observation is assigned to the class whose centroid it is closest to. The distance can be measured with the Euclidean, Mahalanobis, Minkowski, or another distance, chosen according to the actual problem.
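A minimal sketch of the distance discriminant idea using Euclidean distance to class centroids (Mahalanobis distance could be substituted); the training samples and labels are illustrative assumptions.

```python
# Distance discriminant: assign a new observation to the nearest class centroid.
import numpy as np

X = np.array([[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.5, 5.8]])  # known samples
y = np.array([0, 0, 1, 1])                                       # their class labels

centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def classify(x):
    # pick the class whose centroid is closest to x
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

print(classify(np.array([1.5, 2.1])))  # -> 0
print(classify(np.array([5.2, 6.1])))  # -> 1
```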

2.1.2 Fisher Discriminant Method

Basic idea: draw samples with p indicators from each of two populations and construct a linear discriminant function using the idea of analysis of variance. The coefficients are chosen so that the separation between the two groups is as large as possible while the variation within each group is as small as possible. For a new sample, substitute its p indicators into the discriminant function to obtain a score y, and compare y with the discriminant threshold (boundary point) to decide which population the sample belongs to; under the assumption of equal prior probabilities, the threshold is usually taken as the midpoint (or a sample-size-weighted average) of the mean discriminant scores of the two groups. Finally, a statistical test is used to assess the discriminant effect; if the test is significant, the discriminant is considered valid, otherwise it is not. The above describes discrimination between two populations; for more populations the method must be extended, and as the number of populations grows, the number of discriminant functions grows with it, making the calculations more complex.
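A minimal two-population Fisher discriminant sketch using scikit-learn's LinearDiscriminantAnalysis; the two small samples below are illustrative assumptions.

```python
# Fisher discriminant for two populations with scikit-learn (illustrative data).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[2.9, 1.0], [3.1, 1.2], [3.0, 0.9],    # population 1
              [5.0, 2.1], [5.2, 2.0], [4.9, 2.3]])   # population 2
y = np.array([1, 1, 1, 2, 2, 2])

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[3.2, 1.1], [5.1, 2.2]]))  # -> [1 2]
```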

2.1.3 Bayes Discriminant Method

Basic idea: assume some prior knowledge about the objects under study, namely that the prior probability of each of the k populations and its probability density function are known. Using Bayes' theorem, compute the posterior probability that an observed sample belongs to each population, and assign the sample to the population with the largest posterior probability.

2.1.4 Stepwise Discriminant Method

The basic idea is similar to stepwise regression: using a "variables in, variables out" procedure, variables are introduced into the discriminant function one at a time, while variables introduced earlier that have become insignificant are considered for removal.

2.2 Cluster Analysis

Cluster analysis is an unsupervised classification method, meaning that no categories are specified in advance. Depending on the objects being classified, cluster analysis can be divided into sample clustering (Q-type) and variable clustering (R-type). Sample clustering classifies the observed samples, while variable clustering aims to identify mutually independent and representative variables without losing significant information; variable clustering is therefore a dimensionality-reduction method.

2.2.1 Hierarchical Clustering Method

Basic idea: start by treating each sample as a class of its own; then calculate the distance between every pair of classes and merge the two closest classes. Repeat until all samples have been merged into a single class. The method is applicable to both sample and variable clustering, and several between-class distance criteria and distance measures are available for different situations.
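A minimal hierarchical-clustering sketch with SciPy, merging the closest classes step by step and then cutting the tree into a chosen number of clusters; the data points, the Ward linkage, and the cluster count are illustrative assumptions.

```python
# Hierarchical clustering with SciPy: build the merge tree, then cut it into 2 clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9]])

Z = linkage(X, method="ward")                     # merge history (dendrogram data)
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into 2 clusters
print(labels)   # e.g. [1 1 1 2 2]
```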

2.2.2 K-means Clustering Method

Basic idea: specify the number of clusters and select initial cluster centers; calculate the distance from each observation (sample) to each cluster center and assign it to the nearest cluster; recalculate the cluster centers and repeat until a stopping criterion is met (e.g., a maximum number of iterations).

Usage: the user must specify the number of clusters; the method applies only to sample clustering (Q-type), not to variable clustering (R-type).
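A minimal K-means sketch with scikit-learn; the data and the chosen number of clusters are illustrative assumptions.

```python
# K-means clustering with scikit-learn (illustrative data; k must be given in advance).
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # cluster assignment of each sample
print(km.cluster_centers_)  # final cluster centers
```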

2.2.3 Two-step Clustering Method

Basic idea: Perform pre-clustering followed by formal clustering. Applicable as an intelligent clustering method for large datasets or complex category structures, capable of handling both discrete and continuous variables, automatically selecting the number of clusters, and processing extremely large sample sizes.

2.2.4 Fuzzy Cluster Analysis

2.2.5 Cluster Methods Combined with Genetic Algorithms, Neural Networks, or Grey Theory

3. Evaluation Models

3.1 Analytic Hierarchy Process (AHP)

Basic idea: A multi-criteria decision and evaluation method combining qualitative and quantitative aspects. Decomposes decision-related elements into goal, criterion, and alternative layers, ranking decision alternatives based on human judgment, followed by qualitative and quantitative analysis. It hierarchizes and quantifies human thought processes, providing quantitative bases for analysis, decision-making, evaluation, forecasting, and control.

Basic steps: Construct a hierarchical structure model; create a pairwise comparison matrix; perform hierarchical single ranking and consistency check (i.e., checking whether the subjectively constructed pairwise comparison matrix has good overall consistency); perform hierarchical total ranking and consistency check (checking consistency between layers).
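A minimal sketch of the single-ranking step: derive weights from a pairwise comparison matrix via its principal eigenvector and check consistency with the consistency ratio. The comparison matrix is an illustrative assumption, and the random-index values are the commonly cited Saaty figures.

```python
# AHP single ranking: weights from the principal eigenvector, consistency ratio check.
import numpy as np

A = np.array([[1,   3,   5],
              [1/3, 1,   2],
              [1/5, 1/2, 1]])            # pairwise comparison matrix for 3 criteria

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()                           # normalized weight vector

n = A.shape[0]
CI = (eigvals.real[k] - n) / (n - 1)      # consistency index
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}[n]   # Saaty random indices
CR = CI / RI                              # acceptable consistency if CR < 0.1
print(w, CR)
```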

Advantages: It relies entirely on subjective evaluations for ranking alternatives, requiring little data and short decision-making time. Overall, AHP introduces quantitative analysis into complex decision processes, effectively utilizing preference information provided by decision-makers in pairwise comparisons for analysis and decision support, absorbing qualitative analysis results while leveraging quantitative analysis advantages, resulting in a highly systematic and scientific decision-making process, especially suitable for decision analysis in socio-economic systems.

Disadvantages: The decision-making process using AHP is highly subjective. If the decision-maker’s judgment is overly influenced by subjective preferences, leading to a distortion of objective laws, the results of AHP become unreliable.

Applicable Scope: Especially suitable for situations where qualitative judgments play a significant role and where decision results are difficult to measure accurately. To ensure AHP decision conclusions align with objective laws, decision-makers must have a deep and comprehensive understanding of the issues at hand. Additionally, when faced with numerous factors and large-scale evaluation issues, this model can encounter problems, requiring evaluators to thoroughly grasp the nature of the problem, the elements involved, and their logical relationships; otherwise, evaluation results may be unreliable and inaccurate.

Improvement Methods:

(1) The pairwise comparison matrix can be obtained using the Delphi method.

(2) If the number of evaluation indicators is too large (generally exceeding 9), the weights derived from AHP will have certain biases, making the combined evaluation model results unreliable. Based on the actual situation and characteristics of the evaluation object, certain methods can be used to layer and categorize the original indicators, ensuring that the number of indicators in each layer is less than 9.

3.2 Grey Comprehensive Evaluation Method (Grey Correlation Analysis)

Basic idea: the essence of grey relational analysis is to compare and rank the evaluation objects according to the degree of relational closeness between each scheme and the optimal (reference) scheme. The greater the relational degree, the more closely the comparison sequence follows the reference sequence; the smaller it is, the more their trends diverge. The ranking of relational degrees gives the evaluation result.

Basic steps: establish the original indicator matrix; determine the optimal (reference) indicator sequence; standardize or make the indicators dimensionless; calculate the difference sequences and the maximum and minimum differences; compute the relational coefficients; calculate the relational degrees.

Advantages: it is an effective model for evaluating systems containing a large amount of unknown information, combining qualitative and quantitative analysis. It can handle problems in which evaluation indicators are difficult to quantify or count accurately, reduces the impact of human factors, and makes the evaluation results more objective and accurate. The whole calculation process is simple, easy to understand, and easy to manage; the data need not be normalized and can be used directly in the calculations, giving strong reliability; the indicator system can be adjusted to the specific situation; and only a small number of representative samples is required.

Disadvantages: it requires sample data with time-series characteristics; it only distinguishes the relative quality of the evaluation objects and does not reflect absolute levels, so comprehensive evaluation based on grey relational analysis carries all the drawbacks of "relative evaluation."

Applicable Scope: there are no strict requirements on sample size and no distributional assumptions, so the method suits problems with only a small amount of observational data. When it is used for evaluation, the indicator system and the weight distribution are the key issues, and how appropriately they are chosen directly affects the final evaluation results.

Improvement Methods: (1) Use a combined weighting method: derive the weight coefficients from a combination of objective and subjective weighting methods. (2) Combine with the TOPSIS method: consider not only the relational degree between each sequence and the positive ideal sequence but also its relational degree with the negative ideal sequence, and compute the final relational degree from both.
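A minimal grey relational analysis sketch following the basic steps above; the decision matrix is an illustrative assumption, larger-is-better indicators and a resolution coefficient of 0.5 are assumed, and equal indicator weights are used.

```python
# Grey relational analysis: relational coefficients against the optimal reference sequence.
import numpy as np

X = np.array([[0.8, 0.6, 0.9],    # scheme 1, three indicators
              [0.7, 0.9, 0.6],    # scheme 2
              [0.9, 0.7, 0.8]])   # scheme 3

X = X / X.max(axis=0)             # simple normalization per indicator (larger is better)
ref = X.max(axis=0)               # optimal (reference) sequence

diff = np.abs(X - ref)            # difference sequences
dmin, dmax = diff.min(), diff.max()
rho = 0.5                         # resolution coefficient
xi = (dmin + rho * dmax) / (diff + rho * dmax)   # relational coefficients
r = xi.mean(axis=1)               # relational degree per scheme (equal weights)
print(r, r.argsort()[::-1] + 1)   # degrees and ranking of the schemes
```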

3.3 Fuzzy Comprehensive Evaluation Method

Basic idea: based on fuzzy mathematics, this method quantifies unclear and hard-to-quantify factors using the principle of fuzzy relation synthesis, giving a comprehensive evaluation of the membership grades of the evaluated object with respect to the various factors (or evaluation sets). The comprehensive evaluation assigns a non-negative real evaluation value to each object according to the given conditions and ranks the objects accordingly.

Basic steps: determine the factor set and the evaluation (grade) set; construct the fuzzy relation (membership) matrix; determine the indicator weights; perform the fuzzy synthesis and make the evaluation.

Advantages: the mathematical model is simple and easy to master, and it performs well on complex problems involving many factors and levels. The fuzzy evaluation model can rank objects by their comprehensive scores and can also grade them by the maximum membership principle, so the results are rich in information. Evaluation is carried out object by object, and each evaluated object receives a unique evaluation value that is not affected by the set of objects being evaluated. The method fits closely with Eastern habits of thought and description, making it well suited to evaluating socio-economic system problems.

Disadvantages: it does not resolve the information redundancy caused by correlations between evaluation indicators; there is no systematic method for determining membership functions; and the synthesis operators still need further study. The evaluation process relies heavily on subjective judgment, and the weights of the various factors are determined with a degree of subjectivity, so fuzzy comprehensive evaluation is a method based on subjective information.

Applicable Scope: widely used in fields such as economic management. The reliability and accuracy of the results depend on a reasonable choice of factors, of the factor weights, and of the synthesis operator used in the comprehensive evaluation.

Improvement Methods: (1) Use a combined weighting method: derive the weight coefficients from a combination of objective and subjective weighting methods.
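A minimal fuzzy comprehensive evaluation sketch using the common weighted-average synthesis operator; the weights and membership matrix are illustrative assumptions.

```python
# Fuzzy comprehensive evaluation: weights composed with a fuzzy membership matrix.
import numpy as np

w = np.array([0.4, 0.35, 0.25])          # indicator weights (sum to 1)
R = np.array([[0.5, 0.3, 0.2],           # membership of indicator 1 in {good, fair, poor}
              [0.2, 0.5, 0.3],           # indicator 2
              [0.1, 0.4, 0.5]])          # indicator 3

B = w @ R                                # comprehensive membership vector: [0.295 0.395 0.31]
grade = ["good", "fair", "poor"][int(B.argmax())]   # maximum membership principle
print(B, grade)                          # -> "fair"
```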

3.4 BP Neural Network Comprehensive Evaluation Method

Basic idea: an interactive evaluation method that continuously modifies the indicator weights according to the user's expected output until a satisfactory result is achieved. Generally speaking, therefore, results obtained from artificial neural network evaluation are more closely aligned with actual situations.

Advantages: neural networks possess adaptive capabilities and provide objective evaluations for multi-indicator comprehensive evaluation problems, which helps reduce human influence in weight determination. In earlier evaluation methods, traditional weight designs often contained significant fuzziness, and human factors heavily influenced the weights. As time and circumstances change, the impact of the various indicators on the corresponding problem may change, and the initially determined weights may no longer match reality. Furthermore, since the whole analysis and evaluation is a complex nonlinear system, a weight-learning mechanism is essential; these points highlight the advantages of artificial neural networks. To address the limitations of variable-selection methods in comprehensive evaluation modeling, neural network principles allow a contribution analysis of the variables, so that insignificant and unimportant factors can be eliminated to build simplified models and avoid interference from subjective factors in variable selection.

Disadvantages: the biggest issue in ANN applications is the inability to provide analytical expressions; the weights cannot be interpreted as regression coefficients and cannot be used to analyze causal relationships, and there is currently no theoretical or practical explanation of the significance of ANN weights. The method requires a large number of training samples, has limited precision, and its range of application is restricted. The greatest barrier to application is the complexity of the evaluation algorithm, which requires computer processing, while commercial software in this area is still immature.

Applicable Scope: neural network evaluation models have adaptive capability and fault tolerance and can handle nonlinear, non-local, large, complex systems. When training on learning samples, there is no need to consider weight coefficients between the input factors; the ANN adjusts itself according to the error between the output and the expected values, thereby reflecting the interactions among factors.

Improvement Methods: (1) Use a combined evaluation method: take part of the results obtained from other evaluation methods as training samples and another part as test samples for verification, and train the neural network until the requirements are met, which yields better results.

3.5 Data Envelopment Analysis (DEA)

3.6 Combined Evaluation Method

4. Forecasting Models

Combining qualitative and quantitative research is the trend in scientific forecasting. In practical forecasting work, qualitative and quantitative predictions should be used together, i.e., making judgments about the future trends of a system based on correct analysis of the system together with the quantitative indicators derived from quantitative forecasting.

4.1 Regression Analysis Method

Basic idea: based on the changing patterns of historical data, find the regression equation between independent and dependent variables to determine model parameters for forecasting. Regression problems can be categorized into univariate and multivariate regression, and linear and nonlinear regression.

Characteristics: the technology is relatively mature, and the forecasting process is simple; it decomposes the influencing factors of the forecasting object, examining the changes of each factor to estimate the future quantitative state of the forecasting object; however, regression models tend to have large errors and poor extrapolation properties.

Applicable Scope: the regression analysis method is generally suitable for medium-term forecasting. It requires a large sample size and good distribution of samples; when the forecasting length exceeds the length of the available original data, using this method cannot theoretically guarantee the accuracy of the forecast results. Additionally, there may be discrepancies between quantifiable results and qualitative analysis results, and it can be challenging to find suitable types of regression equations.
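A minimal regression-forecasting sketch: fit a univariate linear trend to historical data with NumPy and extrapolate one period ahead; the series is an illustrative assumption.

```python
# Univariate linear regression forecast with NumPy (illustrative historical data).
import numpy as np

t = np.arange(1, 9)                              # time index of historical data
y = np.array([2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 7.9, 9.1])

b1, b0 = np.polyfit(t, y, deg=1)                 # slope and intercept of the trend line
y_next = b0 + b1 * 9                             # forecast for the next period
print(round(b1, 3), round(b0, 3), round(y_next, 2))
```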

4.2 Time Series Analysis Method

Basic idea: arrange the historical data of the forecasting object at regular time intervals to form a statistical sequence that changes over time, establish a corresponding model of how the data change over time, and extrapolate that model into the future for forecasting.

Applicable Scope: this method is effective under the premise that past development patterns will continue into the future, making it suitable for short-term forecasts but not for medium- to long-term forecasts. Generally, if the influencing factors of the forecasting object do not undergo dramatic changes, time series analysis can yield good forecasting results; if these factors change dramatically, the results of the time series method will be affected.
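A minimal time-series sketch using simple exponential smoothing, one common extrapolation technique (not the only one); the smoothing constant and the series are illustrative assumptions.

```python
# Simple exponential smoothing: extrapolate the recent level of a series one step ahead.
def ses_forecast(y, alpha=0.3):
    level = y[0]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level   # update the smoothed level
    return level                                      # one-step-ahead forecast

history = [112, 118, 132, 129, 121, 135, 148, 148]
print(round(ses_forecast(history, alpha=0.3), 1))
```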

4.3 Grey Forecasting Method

Basic idea: treat all random variables as grey variables changing within a certain range; instead of conducting large-sample statistical analysis, use data processing methods (data generation and restoration) to organize the chaotic original data into more regular generated data for study, so that the model is built not on the original data but on the generated data.

Applicable Scope: the forecasting model is an exponential function; if the measured quantity develops according to some exponential law, high-precision forecasting results can be expected. The key factors affecting the accuracy and adaptability of the model's predictions are the construction of the background values and the selection of the initial value in the forecasting formula.
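A minimal GM(1,1) sketch following the usual grey-forecasting steps (accumulation, background values, least-squares estimation of the development coefficient, and restoration by differencing); the short series is an illustrative assumption.

```python
# GM(1,1) grey forecast: accumulate, estimate (a, b) by least squares, restore by differencing.
import numpy as np

x0 = np.array([2.87, 3.28, 3.34, 3.73, 3.86])   # original series (illustrative)
x1 = np.cumsum(x0)                               # accumulated (1-AGO) series
z1 = 0.5 * (x1[1:] + x1[:-1])                    # background values

B = np.column_stack([-z1, np.ones(len(z1))])
Y = x0[1:]
a, b = np.linalg.lstsq(B, Y, rcond=None)[0]      # parameters of dx1/dt + a*x1 = b

def x1_hat(k):                                   # fitted accumulated value at index k
    return (x0[0] - b / a) * np.exp(-a * k) + b / a

n = len(x0)
fit = np.array([x1_hat(k) for k in range(n + 1)])
x0_hat = np.diff(fit)                            # restore the original-scale series
print(x0_hat[-1])                                # forecast for the next period
```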

4.4 BP Neural Network Method

The theory of artificial neural networks has the ability to represent any nonlinear relationship and to learn, providing new ideas and methods for solving many practical problems involving complexity, uncertainty, and time variance. By leveraging the learning function of artificial neural networks, a large number of samples are used to train the network, adjusting its connection weights and biases, and the established model can then be used for forecasting. Neural networks can automatically learn from data samples without complex querying and expression processes and can automatically approximate the functions that best describe the patterns in the sample data, regardless of the form of these functions; the more complex the functional form of the system considered, the more pronounced this advantage of neural networks becomes. The basic idea of the error backpropagation algorithm (BP algorithm) is to adjust and modify the network's connection weights and biases through the backpropagation of the network error so as to minimize the error; its learning process consists of forward computation and error backpropagation. A simple three-layer artificial neural network model can realize an arbitrarily complex nonlinear mapping between input and output. Neural network models have been successfully applied in many fields, such as economic forecasting, financial analysis, mortgage assessment, and bankruptcy prediction.

Advantages: they can mimic the structure and functions of the human brain's neural system for information processing and retrieval at various levels, exhibiting strong adaptive capabilities for many non-structured and imprecise patterns, with characteristics such as information memory, autonomous learning, knowledge reasoning, and optimization computation. Their self-learning and adaptive features are not possessed by conventional algorithms and expert system technologies, and they also overcome the difficulty of expressing random and non-quantitative factors mathematically.

Disadvantages: determining the network structure is difficult, sufficient historical data are required, sample selection is challenging, the algorithms are complex, and training is prone to local minima.
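A minimal BP-style forecasting sketch using scikit-learn's MLPRegressor, trained to map two lagged values of a series to the next value; the series, the lag structure, and the network size are illustrative assumptions.

```python
# BP-style neural network forecast: learn a mapping from two lagged values to the next one.
import numpy as np
from sklearn.neural_network import MLPRegressor

series = np.array([0.2, 0.35, 0.5, 0.62, 0.7, 0.78, 0.84, 0.9, 0.93, 0.96])
X = np.column_stack([series[:-2], series[1:-1]])   # inputs: two lagged values
y = series[2:]                                     # target: the next value

net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0).fit(X, y)
print(net.predict([[series[-2], series[-1]]]))     # one-step-ahead forecast
```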

4.5 Support Vector Machine Method

Support vector machines are a machine learning method based on statistical learning theory that seeks to minimize structural risk, achieving good statistical regularity even with a small number of samples. Support vector machines are the core of statistical learning theory: by approximately implementing the principle of structural risk minimization, they enhance the generalization ability of the learning machine, so that a small error on a limited training sample also keeps the error small on independent test sets. In addition, training a support vector machine is a convex optimization problem, so a locally optimal solution is guaranteed to be globally optimal, overcoming the slow convergence and local-minimum problems of neural networks. Selecting the kernel function in SVM methods remains a difficult issue, with no established theoretical guidance to date.
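A minimal support vector regression sketch with scikit-learn; the RBF kernel, the hyperparameters, and the data are illustrative assumptions.

```python
# Support vector regression (RBF kernel) for a one-step-ahead forecast (illustrative data).
import numpy as np
from sklearn.svm import SVR

X = np.arange(1, 11).reshape(-1, 1).astype(float)   # time index
y = np.array([2.0, 2.6, 3.1, 3.9, 4.3, 5.2, 5.8, 6.4, 7.1, 7.6])

model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print(model.predict([[11.0]]))                       # forecast for the next period
```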

4.6 Combined Forecasting Method

In practical forecasting work, from the perspective of information utilization, any single forecasting method uses only part of the useful information and discards the rest. To fully leverage the advantages of the various forecasting models, multiple forecasting methods are often applied to the same forecasting problem. Different forecasting methods provide different kinds of useful information, and combined forecasting integrates different forecasting models in a specific manner. According to the combination theorem, combining various forecasting methods makes use of all the available information, improving forecasting accuracy and performance.

Optimized combined forecasting includes two concepts: the first is a forecasting method that selects appropriate weights to form a weighted average of the results obtained from several forecasting methods, where the key is determining the weighting coefficients of the individual methods; the second is comparing several forecasting methods and selecting the model with the best fit or the smallest standard deviation as the optimal model for forecasting. Combined forecasting is useful when a single forecasting model cannot fully and accurately describe the changing patterns of the forecast quantity.
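A minimal sketch of the first kind of combined forecast described above: a weighted average of the outputs of several individual methods; the individual forecasts and the weights are illustrative assumptions, and choosing the weights is the key step.

```python
# Combined forecast as a weighted average of several individual forecasts (illustrative values).
import numpy as np

forecasts = np.array([102.0, 98.5, 100.3])   # results of three individual methods
weights = np.array([0.5, 0.2, 0.3])          # weights summing to 1

combined = float(weights @ forecasts)
print(combined)   # 100.79
```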
