Objects
Wang, Weijing, Chen, Junjie, Xu, Zhangwei, Dang, Yingnong, Zhang, Dongmei, Yang, Lin, Zhang, Hongyu, Zhao, Pu, Qiao, Bo, Kang, Yu, Lin, Qingwei, Rajmohan, Saravanakumar, Gao, Feng. Institute of Electrical and Electronics Engineers (IEEE); 2021. How long will it take to mitigate this incident for online service systems?
Zhang, Xu, Lin, Qingwei, Wu, Youjiang, Hsieh, Ken, Sui, Kaixin, Meng, Xin, Xu, Yaohai, Zhang, Wenchi, Shen, Furao, Zhang, Dongmei, Xu, Yong, Qin, Si, Zhang, Hongyu, Qiao, Bo, Dang, Yingnong, Yang, Xinsheng, Cheng, Qian, Chintalapati, Murali. USENIX Association; 2019. Cross-dataset time series anomaly detection for cloud systems.
Zhang, Xu, Xu, Yong, Lin, Qingwei, Qiao, Bo, Zhang, Hongyu, Dang, Yingnong, Xie, Chunyu, Yang, Xinsheng, Cheng, Qian, Li, Ze, Chen, Junjie, He, Xiaoting, Yao, Randolph, Lou, Jian-Guang, Chintalapati, Murali, Shen, Furao, Zhang, Dongmei. Association for Computing Machinery; 2019. Robust log-based anomaly detection on unstable log data.
Jiang, Jiajun, Lu, Weihai, Chen, Junjie, Lin, Qingwei, Zhao, Pu, Kang, Yu, Zhang, Hongyu, Xiong, Yingfei, Gao, Feng, Xu, Zhangwei, Dang, Yingnong, Zhang, Dongmei. Association for Computing Machinery; 2020. How to mitigate the incident? An effective troubleshooting guide recommendation technique for online service systems.
Chen, Junjie, Zhang, Shu, He, Xiaoting, Lin, Qingwei, Zhang, Hongyu, Hao, Dan, Kang, Yu, Gao, Feng, Xu, Zhangwei, Dang, Yingnong, Zhang, Dongmei. Institute of Electrical and Electronics Engineers (IEEE); 2020. How incidental are the incidents? Characterizing and prioritizing incidents for large-scale online service systems.
Chen, Xiangning, Lin, Qingwei, Luo, Chuan, Li, Xudong, Zhang, Hongyu, Xu, Yong, Dang, Yingnong, Sui, Kaixin, Zhang, Xu, Qiao, Bo, Zhang, Weiyi, Wu, Wei, Chintalapati, Murali, Zhang, Dongmei. Institute of Electrical and Electronics Engineers (IEEE); 2019. Neural feature search: a neural architecture for automated feature engineering.
Gu, Jiazhen, Luo, Chuan, Qin, Si, Qiao, Bo, Lin, Qingwei, Zhang, Hongyu, Li, Ze, Dang, Yingnong, Cai, Shaowei, Wu, Wei, Zhou, Yangfan, Chintalapati, Murali, Zhang, Dongmei. Association for Computing Machinery; 2020. Efficient incident identification from multi-dimensional issue reports via meta-heuristic search.
Chen, Zhuangbin, Kang, Yu, Li, Liqun, Zhang, Xu, Zhang, Hongyu, Xu, Hui, Zhou, Yangfan, Yang, Li, Sun, Jeffrey, Xu, Zhangwei, Dang, Yingnong, Gao, Feng, Zhao, Pu, Qiao, Bo, Lin, Qingwei, Zhang, Dongmei, Lyu, Michael R. Association for Computing Machinery (ACM); 2020. Towards Intelligent Incident Management: Why We Need It and How We Make It.
Wang, Lu, Zhao, Pu, Zhang, Hongyu, Rajmohan, Saravan, Zhang, Dongmei, Du, Chao, Luo, Chuan, Su, Mengna, Yang, Fangkai, Liu, Yudong, Lin, Qingwei, Wang, Min, Dang, Yingnong. Association for Computing Machinery; 2022. NENYA: Cascade Reinforcement Learning for Cost-Aware Failure Mitigation at Microsoft 365.
Liu, Yudong, Yang, Hailan, Zhang, Chenjian, Wang, Paul, Dang, Yingnong, Rajmohan, Saravan, Zhang, Dongmei, Zhao, Pu, Ma, Minghua, Wen, Chengwu, Zhang, Hongyu, Luo, Chuan, Lin, Qingwei, Yi, Chang, Wang, Jiaojian. Association for Computing Machinery; 2022. Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems.