Latvian State Forests, together with Scandic Fusion, embarked on an ambitious project to improve the accuracy and reliability of its forestry output predictions.
The goal was to estimate the types and quantities of forestry products that could be harvested from specific areas within the forest, using a variety of attributes as inputs. These predictions were crucial for the organization’s operations, as they informed decisions across multiple facets of the business.
The ability to forecast forestry output with precision is of strategic importance to Latvian State Forests for several reasons:
- Customer Commitments: Latvian State Forests’ customers rely on accurate information about what products will be available and when they can expect delivery. These commitments are based on predictions about the types and quantities of products, making accuracy critical to maintaining customer satisfaction and trust.
- Strategic Harvesting Decisions: The predictions also play a vital role in determining which areas of the forest should be harvested and when. This decision-making process ensures that resources are utilized optimally, balancing various considerations.
- Logistics and Routing Optimization: Accurate output predictions enable more efficient planning of transportation routes for harvested logs. By optimizing the logistics of log transportation, Latvian State Forests can reduce costs, minimize environmental impact, and improve overall operational efficiency.
- Supply Chain Integration: Forecasting the possible harvest is an integral part of the broader supply planning process. This process is essential for optimizing operations across the organization, ensuring that the right products are available at the right time to meet demand.
During this project, machine learning techniques were employed to forecast the types and quantities of products that could be harvested from specific forest areas. The predictions covered a range of standard forestry products, such as firewood and logs of various specific dimensions, among many others, for each tree species in the specific forest area.
Project Implementation
The project’s implementation required the integration of various data sources and the application of advanced machine learning techniques. The inputs for the harvest predictions included:
- Historical Data from Harvesters: The actual data collected by harvesters during previous harvests provided a valuable historical baseline against which new predictions could be measured.
- Forest Taxonomy Data: This included generic descriptions of the forest areas, such as age, humidity, soil type, and other environmental factors that could influence the type and quantity of forestry products.
- Pre-Harvest Measurements: Detailed measurements and samples taken immediately before harvesting provided current, on-the-ground data that was critical for accurate predictions.
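As a rough illustration of how these three inputs might be combined into training data, the sketch below joins them on a shared area identifier. It is written in Python for brevity rather than in the R used on the project, and every field name and value in it is a hypothetical placeholder, not the project's actual schema.

```python
# All identifiers and values below are hypothetical, for illustration only.
taxonomy = {  # generic stand descriptions, keyed by area id
    "A1": {"age_years": 80, "soil_type": "sandy"},
    "A2": {"age_years": 65, "soil_type": "peat"},
}
pre_harvest = {  # measurements taken shortly before harvesting
    "A1": {"mean_diameter_cm": 31.0},
}
harvester_history = [  # actual output recorded by harvesters
    {"area_id": "A1", "species": "pine", "product": "sawlog", "volume_m3": 120.0},
    {"area_id": "A1", "species": "pine", "product": "firewood", "volume_m3": 35.0},
]


def build_features(area_id):
    """Merge the stand description with pre-harvest measurements, if any."""
    row = {"area_id": area_id, **taxonomy[area_id]}
    measured = pre_harvest.get(area_id)
    # Areas not yet surveyed keep an explicit flag instead of a guessed value.
    row["has_pre_harvest"] = measured is not None
    row["mean_diameter_cm"] = measured["mean_diameter_cm"] if measured else None
    return row


def build_training_rows():
    """Attach area features to each historical (species, product) volume."""
    rows = []
    for record in harvester_history:
        features = build_features(record["area_id"])
        rows.append({**features, **record})
    return rows


training_rows = build_training_rows()
```

Each training row then carries the area's attributes alongside the harvested volume for one species and product, which is the shape a supervised model needs.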
The technology stack for this project centered around the R programming language, chosen for its robust data processing capabilities and versatility in statistical computing. Azure ML Studio was used to facilitate the model selection process.
Approach to Model Selection
- Data Preparation: The first step involved gathering the relevant historical and pre-harvest data.
- Initial Candidate Model Selection: More than 20 different machine learning models were automatically tested to identify those with the highest potential for accurate predictions.
- Model Refinement: The top 3-5 models were selected for further manual fine-tuning. After rigorous testing and evaluation, the XGBoost model emerged as the most effective and was chosen for the final implementation.
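The screening step above can be sketched as ranking candidates by cross-validated error and keeping a shortlist. This is a minimal Python sketch, not the Azure ML Studio and R workflow the project used. The two toy candidates, the mean absolute error metric, and the data are assumptions made purely for illustration.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_val_mae(fit, xs, ys, k=3):
    """Mean absolute error of a candidate, averaged over k held-out folds."""
    errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)
        errors.append(mean(abs(predict(xs[i]) - ys[i]) for i in fold))
    return mean(errors)


def shortlist(candidates, xs, ys, top_n=3):
    """Rank candidates by cross-validated MAE and keep the best top_n."""
    scored = sorted((cross_val_mae(fit, xs, ys), name)
                    for name, fit in candidates.items())
    return [name for _, name in scored[:top_n]]


# Two toy candidates standing in for the 20+ models screened in the project.
def fit_mean(train_x, train_y):
    avg = mean(train_y)
    return lambda x: avg


def fit_ratio(train_x, train_y):
    # Assumes volume is proportional to stand area.
    slope = sum(train_y) / sum(train_x)
    return lambda x: slope * x


# Made-up data: stand area (ha) and harvested volume (m3).
areas_ha = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
volumes_m3 = [210.0, 395.0, 610.0, 805.0, 990.0, 1215.0]
ranking = shortlist({"mean": fit_mean, "ratio": fit_ratio},
                    areas_ha, volumes_m3, top_n=1)
```

Here `ranking` holds the name of the candidate with the lowest held-out error. In practice the shortlisted models would then be tuned by hand, as described in the refinement step.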
Although the project was initially planned to proceed through six iterations, it ultimately required twelve. This increase was due to several challenges, including data quality issues, the emergence of new contextual information during the project, and the discovery of manual interventions in the source data that had previously gone unnoticed.
Outcomes of the Project
The project resulted in a significant transformation of Latvian State Forests’ forecasting process. Previously, the organization relied on Excel-based models, recalculating coefficients regularly to update a legacy prediction system. This old process was time-consuming and lacked the flexibility to adapt to changing conditions in real time.
In contrast, the new process leverages the latest data from the organization’s data warehouse. After each data refresh, the machine learning model automatically forecasts the output for all forest areas slated for cutting but not yet harvested. These predictions provide a critical input into the decision-making process, enabling Latvian State Forests to make more informed and timely decisions about which areas to harvest.
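The refresh-then-forecast step might look roughly like the following Python sketch (the production pipeline was built in R). The field names `planned_cut` and `harvested_on`, and the stand-in prediction function, are hypothetical. The trained model would take the stand-in's place.

```python
from datetime import date


def areas_to_forecast(areas):
    """Keep areas scheduled for cutting that have no harvest date yet."""
    return [a for a in areas
            if a["planned_cut"] is not None and a["harvested_on"] is None]


def forecast_after_refresh(areas, predict):
    """Re-run the model on every pending area; returns {area_id: prediction}."""
    return {a["area_id"]: predict(a) for a in areas_to_forecast(areas)}


def stand_in_predict(area):
    # Placeholder for the trained model: a fixed, made-up yield per hectare.
    return {"sawlog_m3": 150.0 * area["area_ha"]}


# Made-up snapshot of the warehouse after a refresh.
areas = [
    {"area_id": "A1", "area_ha": 2.0, "planned_cut": date(2024, 3, 1),
     "harvested_on": date(2024, 3, 9)},   # already harvested
    {"area_id": "A2", "area_ha": 4.0, "planned_cut": date(2024, 6, 1),
     "harvested_on": None},               # pending: gets a forecast
    {"area_id": "A3", "area_ha": 1.5, "planned_cut": None,
     "harvested_on": None},               # not scheduled for cutting
]
forecasts = forecast_after_refresh(areas, stand_in_predict)
```

Only the pending area receives a forecast, so each refresh produces predictions for exactly the areas still awaiting a harvesting decision.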
A comparison of the new model’s predictions with actual outcomes showed that it significantly outperformed the legacy system for high-volume and high-value-added products. The new model’s precision has improved Latvian State Forests’ ability to meet customer expectations and optimize its operations.
Lessons Learned
The project provided several important lessons that will inform future initiatives:
- Defining Success Metrics: One of the key challenges encountered during the project was defining an appropriate metric for success. Forecasting the output of a forest is inherently complex, and the team initially struggled to find a metric that accurately reflected the model’s precision.
- Prioritizing Product Accuracy: Not all products are of equal importance when it comes to prediction accuracy. For Latvian State Forests, it was particularly crucial to ensure high precision for high-volume and high-value-added products. Even if this focus on key products came at the cost of reduced accuracy for less critical outputs, the trade-off was necessary for optimizing overall business outcomes.
- Understanding Client Operations: The project also highlighted the importance of understanding the full scope of client operations that are relevant to the initiative. During the project, significant prediction errors for certain products were traced back to manual interventions in historical harvest orders. These interventions, which had not been initially disclosed, were changes made to the historical data after the forest had been harvested. This experience reinforced the need for thorough communication with clients to uncover any anomalies and adjustments in the data, so that they can be handled in the way that best serves the goals of the initiative.
- The Role of Effective Project Management: Managing a complex project with multiple iterations and unforeseen challenges required strong project management. The ability to manage expectations, keep the project on track, and adapt to new information as it emerged was crucial to the project’s success.
- The Value of Domain Expertise: Finally, the project underscored the importance of having a knowledgeable business analyst on the team. Understanding the intricacies of the forestry process and how the predictions would be used in practice was essential for guiding the project and ensuring that the final model met the organization’s needs.
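The first two lessons, on choosing a success metric and prioritizing key products, can be made concrete with a weighted error measure. The Python sketch below is one possible formulation, not the metric the project actually adopted. The product names, weights, and volumes are invented for illustration.

```python
def weighted_abs_pct_error(actual, predicted, weights):
    """Weighted absolute percentage error across products.

    actual, predicted: {product: volume}; weights: {product: importance}.
    Products missing from weights get weight 0 and do not affect the score.
    """
    num = sum(weights.get(p, 0.0) * abs(predicted.get(p, 0.0) - actual[p])
              for p in actual)
    den = sum(weights.get(p, 0.0) * abs(actual[p]) for p in actual)
    if den == 0:
        raise ValueError("weighted actual volume is zero; metric undefined")
    return num / den


# Made-up volumes (m3) for one forest area.
actual = {"sawlog": 100.0, "firewood": 50.0}
model_a = {"sawlog": 95.0, "firewood": 70.0}   # good on sawlogs, poor on firewood
model_b = {"sawlog": 80.0, "firewood": 50.0}   # poor on sawlogs, exact on firewood

equal = {"sawlog": 1.0, "firewood": 1.0}
value_weighted = {"sawlog": 3.0, "firewood": 1.0}  # hypothetical priority weights

a_equal = weighted_abs_pct_error(actual, model_a, equal)
b_equal = weighted_abs_pct_error(actual, model_b, equal)
a_weighted = weighted_abs_pct_error(actual, model_a, value_weighted)
b_weighted = weighted_abs_pct_error(actual, model_b, value_weighted)
```

With equal weights, `model_b` scores better. Once sawlogs are weighted more heavily, `model_a` comes out ahead, which mirrors the trade-off described above: accepting larger errors on minor products in exchange for precision on the ones that matter most.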
As Latvian State Forests continues to innovate, this project stands as a testament to the transformative potential of machine learning in the forestry sector. At Scandic Fusion, we are proud to have played a key role in bringing this project to life.