Literature Review on the Design and Implementation of a Dual-Arm Rubik's Cube Solving Robot

 2023-04-14 11:04

Literature Review

1.1 Introduction:

The goal of this research is to construct a suitable guidance function for evolutionary and hill-climbing search. Such functions, variously called error functions, loss functions, or objective functions, are a key component of most machine learning approaches: they specify how close a point in the search space is to a true solution. This work applies the idea to solving the Rubik's cube with deep reinforcement learning. The Rubik's cube has an enormous number of configurations, approximately 4.3×10^19, and the cube's mechanics define how its layers rotate within this search space. In the solved configuration, the cube as a geometrical object exhibits symmetries that are broken when it is driven away from that configuration; when all of these symmetries are restored, the configuration of the cube matches the solution of the puzzle. We make use of a deep reinforcement learning algorithm over these possible configurations.

This project mainly studies the application of deep reinforcement learning to the restoration of the Rubik's cube, treated as a combinatorial optimization problem. The third-order (3x3) Rubik's cube is a classical combinatorial optimization problem, and because the target state is known, most current solving algorithms are based on human knowledge, for instance group theory and algebraic abstraction. Deep learning and reinforcement learning are both in a period of rapid development, and deep reinforcement learning, produced by combining the two, is more effective still. The 2016 match between AlphaGo and Lee Sedol left a deep impression on the public. This project therefore aims to apply deep reinforcement learning of the AlphaGo family to the Rubik's cube state-space search problem. Finally, the algorithm is deployed on a dual-arm cube-solving robot.
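The guidance-function idea described above can be sketched on a toy puzzle. The code below is an illustrative assumption, not the project's actual implementation: the 4-element permutation puzzle, the adjacent-swap move set, and all function names are invented stand-ins for the cube, whose ~4.3×10^19 states cannot be enumerated. The objective function counts out-of-place elements, and hill climbing follows it greedily.

```python
# Toy stand-in for a cube state: a permutation of 0..n-1.  The objective
# (error/loss) function is zero exactly at the solved state, and greedy
# hill climbing descends on it.  Everything here is illustrative only.

def cost(state):
    """Objective function: number of elements out of place."""
    return sum(1 for i, v in enumerate(state) if v != i)

def neighbors(state):
    """Moves: swap any adjacent pair (loosely analogous to a face turn)."""
    out = []
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        out.append(tuple(s))
    return out

def hill_climb(state, max_steps=200):
    """Greedy descent on the objective; may stall in a local minimum,
    which is exactly the weakness the later sections address."""
    for _ in range(max_steps):
        best = min(neighbors(state), key=cost)
        if cost(best) >= cost(state):
            break  # local minimum reached
        state = best
    return state
```

On the real cube the local-minimum problem is far more severe, which motivates the learned guidance and reinforcement learning approaches reviewed below.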
While verifying the algorithm, it is optimized in two respects: the success rate and the number of moves. Accumulating these ideas will help improve search and optimization algorithms across many different problems.

The Rubik's cube consists of 26 smaller cubes called cubelets, classified by their sticker count: center, edge, and corner cubelets have 1, 2, and 3 stickers attached respectively. There are 54 stickers in total, each uniquely identifiable by the cubelet it is attached to and its location on that cubelet.

1.2 Research Literature Review:

There is plenty of research and literature on this topic, and the best way to improve this project is to review as much of it as I can. Absorbing these formulations and ideas will help me finish my project, so a few representative works are summarized here.

1.2.1 "Solving the Rubik's cube via quantum mechanics and deep reinforcement learning" by Corli, Sebastiano, et al. from Italy made a great contribution to this field.

Figure 1: (a) The different kinds of cubies composing the Cube: in light blue the edges, in pink the corners, and in grey the centrals. The arrows define the arbitrary choice of orientation for the solved state; two different colors are applied for corners (pink) and edges (blue). (b) The faces of the Cube with their conventional colors blue, red, and white. Coordinate axes are introduced to describe the positions of the cubies, and the choice of orientation in the solved configuration for the corners alone. (c) The same coordinate axes describe the positions of the cubies and the choice of orientation for the edges in the solved configuration.

1.2.2 "Solving the Rubik's cube with stepwise deep learning" by Colin G. Johnson. He wanted to show that solving the Rubik's cube can shed light on difficult search problems more generally; his approach is another variant of a deep reinforcement learning algorithm.
Now consider his research in detail. In Johnson (2018) the LGF (Learned Guidance Function) approach was applied to the problem of unscrambling the Rubik's Cube, using a number of classifiers from the scikit-learn library to implement LGFs. That work demonstrated that (1) an LGF can learn to recognize, to a decent level of accuracy, the number of turns that have been made to a cube; and (2) this LGF can then be used to unscramble particular states of the cube in a sensible number of moves. Unscrambling is not one of the non-oracular problems, because the goal state is known, but it has a complex fitness landscape with many local minima, and so is a good test for these kinds of algorithms. The search space C consists of all possible configurations of colored facelets on the six faces of the cube, each face carrying a 3x3 set of facelets. The move set M is a list of twelve 90° moves (Singmaster, 1981), each of which is a function from C to C. The notation m(c) denotes the application of a move m in M to the cube c, returning the new state of the cube. An earlier paper by the author (Johnson, 2018) applied a number of learning algorithms to the problem of learning LGFs for the Rubik's cube, with random forests proving to be the best approach. That paper did not use any deep learning (Goodfellow et al., 2017) approaches; the later paper extends the work by using deep learning.

The Cube is composed of 26 smaller cubes, called cubies. Six of them show one face (the centrals) and cannot be moved from their relative positions, 12 edges show two faces, and 8 corners show three faces. A visual description is shown in Figure 1. A corner cubie cannot take the place of an edge cubie, and vice versa. Any configuration of the Cube can be uniquely identified by two sets of features, namely the positions and orientations of the cubies. The position of a cubie marks how far the cubie is from its place in the solved configuration.
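The formalism above, where a configuration c is an assignment of facelets and each move m in M is a function from C to C, can be sketched as follows. This is a minimal sketch under an assumed toy index layout: the 12-sticker numbering and the cycle structure of the `U` move below are made up for illustration, not the real 3x3x3 facelet numbering.

```python
# A configuration c is a tuple of sticker labels; a move m is a permutation
# of sticker indices.  A quarter turn is a product of disjoint 4-cycles, so
# applying it four times returns any configuration to where it started.

def perm_from_cycles(n, cycles):
    """Build a permutation of 0..n-1 from disjoint cycles; in each cycle
    (a, b, c, d) the sticker at position a moves to position b, and so on."""
    p = list(range(n))
    for cyc in cycles:
        for i, a in enumerate(cyc):
            b = cyc[(i + 1) % len(cyc)]
            p[b] = a
    return p

def apply_move(perm, c):
    """m(c): the sticker that was at index perm[i] ends up at index i."""
    return tuple(c[perm[i]] for i in range(len(c)))

# Hypothetical miniature "U" turn cycling two rings of four stickers each.
U = perm_from_cycles(12, [(0, 1, 2, 3), (4, 5, 6, 7)])
```

Because every quarter turn decomposes into 4-cycles like this, each generator of the Rubik's group has order four, which the test of applying `U` four times confirms.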
The orientation of a cubie records how, keeping the cubie in its solved position, it has been rotated about a suitable rotation axis. Orientation is graphically represented as an arrow on a face of the cubie, as in Figure 1. Orientations and positions are modified by rotating the layers of the Cube. These transformations are elements of the Rubik's group R, which has six generators, each corresponding to the rotation of a colored face: U and D are the rotations of the upper and lower faces, F and B of the front and back faces, and L and R of the left and right faces respectively. To describe the orientation and position of each cubie, one may fix an orientation for the solved configuration together with three Cartesian axes (Figure 1). The geometry of orientation depends on the number of faces a cubie displays: the ways in which a cubie can be oriented are the ways the same cubie can be placed in a fixed position by rotating its own faces. Any allowed configuration can be mapped from the solved one once an algebraic representation of the Rubik's group is given. The six cubies fixed in the middle of the faces are called centrals; they are invariant under all transformations and show just one face, so they are not associated with any specific orientation. Edge cubies show two faces, and thus two possible orientations.

1.3 Implication:

We are currently further improving DeepCube by extending it to harder cubes. Autodidactic Iteration can be used to train a network to solve a 3x3 cube and other puzzles, such as n-dimensional sequential-move puzzles and other combination puzzles. Besides further work with the Rubik's Cube, we are working on extending this method to find approximate solutions to other combinatorial optimization problems, such as prediction of protein tertiary structure. Many combinatorial optimization problems can be thought of as sequential decision-making problems, in which case reinforcement learning applies. Bello et al.
train an RNN through policy gradients to solve simple traveling salesman and knapsack problems. We believe that harnessing search will lead to better reinforcement learning approaches for combinatorial optimization. For example, in protein folding, we can think of sequentially placing each amino acid in a 3D lattice at each timestep. If we have a model of the environment, ADI can be used to train a value function which looks at a partially completed state and predicts the future reward when finished. This value function can then be combined with MCTS (Monte Carlo Tree Search) to find approximately optimal conformations. Léon Bottou defines reasoning as "algebraically manipulating previously acquired knowledge in order to answer a new question" [5]. Many machine learning algorithms do not reason about problems but instead use pattern recognition to perform tasks that are intuitive to humans, such as object recognition. By combining neural networks with symbolic AI, we can create algorithms that distill complex environments into knowledge and then reason about that knowledge to solve a problem. DeepCube is able to teach itself how to reason in order to solve a complex environment with only one reward state using pure reinforcement learning.
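The value-function idea at the heart of DeepCube and ADI can be sketched on a toy puzzle. This is a hedged illustration only: on the real cube the value function must be a trained neural network, so here it is replaced by an exact distance-to-solved table computed with breadth-first search on a 4-element permutation puzzle, which a learned V(s) would only approximate. All names and the move set are invented for the sketch.

```python
from collections import deque

GOAL = (0, 1, 2, 3)

def moves(state):
    """Toy move set: swap any adjacent pair."""
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        yield tuple(s)

def distance_table(goal):
    """Exact cost-to-go for every reachable state: a stand-in for the
    learned value function V(s) that ADI would train on the real cube."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        for t in moves(s):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

def greedy_solve(state, value):
    """Follow the value function downhill; with an exact V each step
    reduces the distance by one, so the path found is optimal."""
    path = []
    while value[state] > 0:
        state = min(moves(state), key=lambda t: value[t])
        path.append(state)
    return path

V = distance_table(GOAL)
solution = greedy_solve((3, 2, 1, 0), V)
```

With an approximate (learned) value function, plain greedy descent can be misled, which is why DeepCube combines V(s) with MCTS rather than following it blindly.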


